exoplanet is a toolkit for probabilistic modeling of transit and/or radial velocity observations of exoplanets and other astronomical time series using PyMC3. PyMC3 is a flexible and high-performance model-building language and inference engine that scales well to problems with a large number of parameters. exoplanet extends PyMC3’s language with many of the custom functions and distributions required when fitting exoplanet datasets, including the orbit models, light curve models, Gaussian Process kernels, and reparameterized distributions documented below.
All of these functions and distributions include methods for efficiently calculating their gradients, so they can be used with gradient-based inference methods like Hamiltonian Monte Carlo, the No-U-Turn Sampler (NUTS), and variational inference. These methods tend to be more robust than the methods more commonly used in astronomy (like ensemble samplers and nested sampling), especially when the model has more than a few parameters. For many exoplanet applications, exoplanet (the code) can improve typical performance by orders of magnitude.
exoplanet is being actively developed in a public repository on GitHub, so if you have any trouble, open an issue there.
exoplanet doesn’t have any compiled components, so it can be easily installed from source or by using pip.
The only required dependencies for exoplanet are NumPy, PyMC3, and AstroPy. These can be installed using conda or pip:
conda install numpy pymc3 astropy
or
pip install numpy pymc3 astropy
The source code for exoplanet can be downloaded and installed from GitHub by running
git clone --recursive https://github.com/dfm/exoplanet.git
cd exoplanet
python setup.py install
To run the unit tests, install the following dependencies using pip or conda (you’ll need to use the conda-forge channel to get starry):
conda install -c conda-forge numpy scipy astropy pymc3 pytest starry pip
pip install batman-package parameterized nose
and then execute:
py.test -v
All of the tests should (of course) pass. If any of the tests don’t pass and if you can’t sort out why, open an issue on GitHub.
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
The exoplanet package is mostly just glue that connects many other ideas and pieces of software. In a situation like this, it can be easy to forget about the important infrastructure upon which our science is built. To make sure that you can easily give credit where credit is due, we have tried to make it as painless as possible to work out which citations are expected for a model fit using exoplanet by including an exoplanet.citations.get_citations_for_model() function that introspects the current PyMC3 model and constructs a list of citations for the functions used in that model.
For example, you might compute a quadratically limb-darkened light curve using starry (via the exoplanet.LimbDarkLightCurve class):
import pymc3 as pm
import exoplanet as xo

with pm.Model() as model:
    u = xo.distributions.QuadLimbDark("u")
    orbit = xo.orbits.KeplerianOrbit(period=10.0)
    light_curve = xo.LimbDarkLightCurve(u)
    transit = light_curve.get_light_curve(r=0.1, orbit=orbit, t=[0.0, 0.1])

    txt, bib = xo.citations.get_citations_for_model()
The exoplanet.citations.get_citations_for_model() function generates an acknowledgement that cites the software used in this model: exoplanet itself and its dependencies (AstroPy, PyMC3, Theano, starry, and the Kipping (2013) limb darkening parameterization). The first output gives the acknowledgement text:
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:agol19, exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:kipping13, exoplanet:luger18, exoplanet:pymc3, exoplanet:theano}.
And the second output is a string with BibTeX entries for each of the citations in the acknowledgement text:
print(bib)
@misc{exoplanet:exoplanet,
  author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and Eric Agol and Geert Barentsen and Tom Barclay},
  title = {dfm/exoplanet: exoplanet v0.2.1},
  month = sep,
  year = 2019,
  doi = {10.5281/zenodo.3462740},
  url = {https://doi.org/10.5281/zenodo.3462740}
}

@article{exoplanet:pymc3,
  title = {Probabilistic programming in Python using PyMC3},
  author = {Salvatier, John and Wiecki, Thomas V and Fonnesbeck, Christopher},
  journal = {PeerJ Computer Science},
  volume = {2},
  pages = {e55},
  year = {2016},
  publisher = {PeerJ Inc.}
}

@article{exoplanet:theano,
  title = "{Theano: A {Python} framework for fast computation of mathematical expressions}",
  author = {{Theano Development Team}},
  journal = {arXiv e-prints},
  volume = {abs/1605.02688},
  year = 2016,
  month = may,
  url = {http://arxiv.org/abs/1605.02688}
}

@ARTICLE{exoplanet:kipping13,
  author = {{Kipping}, D.~M.},
  title = "{Efficient, uninformative sampling of limb darkening coefficients for two-parameter laws}",
  journal = {\mnras},
  year = 2013,
  month = nov,
  volume = 435,
  pages = {2152-2160},
  doi = {10.1093/mnras/stt1435},
  adsurl = {http://adsabs.harvard.edu/abs/2013MNRAS.435.2152K},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

@article{exoplanet:astropy13,
  author = {{Astropy Collaboration} and {Robitaille}, T.~P. and {Tollerud}, E.~J. and {Greenfield}, P. and {Droettboom}, M. and {Bray}, E. and {Aldcroft}, T. and {Davis}, M. and {Ginsburg}, A. and {Price-Whelan}, A.~M. and {Kerzendorf}, W.~E. and {Conley}, A. and {Crighton}, N. and {Barbary}, K. and {Muna}, D. and {Ferguson}, H. and {Grollier}, F. and {Parikh}, M.~M. and {Nair}, P.~H. and {Unther}, H.~M. and {Deil}, C. and {Woillez}, J. and {Conseil}, S. and {Kramer}, R. and {Turner}, J.~E.~H. and {Singer}, L. and {Fox}, R. and {Weaver}, B.~A. and {Zabalza}, V. and {Edwards}, Z.~I. and {Azalee Bostroem}, K. and {Burke}, D.~J. and {Casey}, A.~R. and {Crawford}, S.~M. and {Dencheva}, N. and {Ely}, J. and {Jenness}, T. and {Labrie}, K. and {Lim}, P.~L. and {Pierfederici}, F. and {Pontzen}, A. and {Ptak}, A. and {Refsdal}, B. and {Servillat}, M. and {Streicher}, O.},
  title = "{Astropy: A community Python package for astronomy}",
  journal = {\aap},
  year = 2013,
  month = oct,
  volume = 558,
  pages = {A33},
  doi = {10.1051/0004-6361/201322068},
  adsurl = {http://adsabs.harvard.edu/abs/2013A%26A...558A..33A},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

@article{exoplanet:astropy18,
  author = {{Astropy Collaboration} and {Price-Whelan}, A.~M. and {Sip{\H o}cz}, B.~M. and {G{\"u}nther}, H.~M. and {Lim}, P.~L. and {Crawford}, S.~M. and {Conseil}, S. and {Shupe}, D.~L. and {Craig}, M.~W. and {Dencheva}, N. and {Ginsburg}, A. and {VanderPlas}, J.~T. and {Bradley}, L.~D. and {P{\'e}rez-Su{\'a}rez}, D. and {de Val-Borro}, M. and {Aldcroft}, T.~L. and {Cruz}, K.~L. and {Robitaille}, T.~P. and {Tollerud}, E.~J. and {Ardelean}, C. and {Babej}, T. and {Bach}, Y.~P. and {Bachetti}, M. and {Bakanov}, A.~V. and {Bamford}, S.~P. and {Barentsen}, G. and {Barmby}, P. and {Baumbach}, A. and {Berry}, K.~L. and {Biscani}, F. and {Boquien}, M. and {Bostroem}, K.~A. and {Bouma}, L.~G. and {Brammer}, G.~B. and {Bray}, E.~M. and {Breytenbach}, H. and {Buddelmeijer}, H. and {Burke}, D.~J. and {Calderone}, G. and {Cano Rodr{\'{\i}}guez}, J.~L. and {Cara}, M. and {Cardoso}, J.~V.~M. and {Cheedella}, S. and {Copin}, Y. and {Corrales}, L. and {Crichton}, D. and {D'Avella}, D. and {Deil}, C. and {Depagne}, {\'E}. and {Dietrich}, J.~P. and {Donath}, A. and {Droettboom}, M. and {Earl}, N. and {Erben}, T. and {Fabbro}, S. and {Ferreira}, L.~A. and {Finethy}, T. and {Fox}, R.~T. and {Garrison}, L.~H. and {Gibbons}, S.~L.~J. and {Goldstein}, D.~A. and {Gommers}, R. and {Greco}, J.~P. and {Greenfield}, P. and {Groener}, A.~M. and {Grollier}, F. and {Hagen}, A. and {Hirst}, P. and {Homeier}, D. and {Horton}, A.~J. and {Hosseinzadeh}, G. and {Hu}, L. and {Hunkeler}, J.~S. and {Ivezi{\'c}}, {\v Z}. and {Jain}, A. and {Jenness}, T. and {Kanarek}, G. and {Kendrew}, S. and {Kern}, N.~S. and {Kerzendorf}, W.~E. and {Khvalko}, A. and {King}, J. and {Kirkby}, D. and {Kulkarni}, A.~M. and {Kumar}, A. and {Lee}, A. and {Lenz}, D. and {Littlefair}, S.~P. and {Ma}, Z. and {Macleod}, D.~M. and {Mastropietro}, M. and {McCully}, C. and {Montagnac}, S. and {Morris}, B.~M. and {Mueller}, M. and {Mumford}, S.~J. and {Muna}, D. and {Murphy}, N.~A. and {Nelson}, S. and {Nguyen}, G.~H. and {Ninan}, J.~P. and {N{\"o}the}, M. and {Ogaz}, S. and {Oh}, S. and {Parejko}, J.~K. and {Parley}, N. and {Pascual}, S. and {Patil}, R. and {Patil}, A.~A. and {Plunkett}, A.~L. and {Prochaska}, J.~X. and {Rastogi}, T. and {Reddy Janga}, V. and {Sabater}, J. and {Sakurikar}, P. and {Seifert}, M. and {Sherbert}, L.~E. and {Sherwood-Taylor}, H. and {Shih}, A.~Y. and {Sick}, J. and {Silbiger}, M.~T. and {Singanamalla}, S. and {Singer}, L.~P. and {Sladen}, P.~H. and {Sooley}, K.~A. and {Sornarajah}, S. and {Streicher}, O. and {Teuben}, P. and {Thomas}, S.~W. and {Tremblay}, G.~R. and {Turner}, J.~E.~H. and {Terr{\'o}n}, V. and {van Kerkwijk}, M.~H. and {de la Vega}, A. and {Watkins}, L.~L. and {Weaver}, B.~A. and {Whitmore}, J.~B. and {Woillez}, J. and {Zabalza}, V. and {Astropy Contributors}},
  title = "{The Astropy Project: Building an Open-science Project and Status of the v2.0 Core Package}",
  journal = {\aj},
  year = 2018,
  month = sep,
  volume = 156,
  pages = {123},
  doi = {10.3847/1538-3881/aabc4f},
  adsurl = {http://adsabs.harvard.edu/abs/2018AJ....156..123A},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

@article{exoplanet:luger18,
  author = {{Luger}, R. and {Agol}, E. and {Foreman-Mackey}, D. and {Fleming}, D.~P. and {Lustig-Yaeger}, J. and {Deitrick}, R.},
  title = "{starry: Analytic Occultation Light Curves}",
  journal = {\aj},
  year = 2019,
  month = feb,
  volume = 157,
  pages = {64},
  doi = {10.3847/1538-3881/aae8e5},
  adsurl = {http://adsabs.harvard.edu/abs/2019AJ....157...64L},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

@article{exoplanet:agol19,
  author = {{Agol}, Eric and {Luger}, Rodrigo and {Foreman-Mackey}, Daniel},
  title = "{Analytic Planetary Transit Light Curves and Derivatives for Stars with Polynomial Limb Darkening}",
  journal = {arXiv e-prints},
  year = 2019,
  month = aug,
  eprint = {1908.03222},
  url = {http://arxiv.org/abs/1908.03222}
}
exoplanet.orbits.KeplerianOrbit(period=None, a=None, t0=None, t_periastron=None, incl=None, b=None, duration=None, ecc=None, omega=None, Omega=None, m_planet=0.0, m_star=None, r_star=None, rho_star=None, m_planet_units=None, rho_star_units=None, model=None, contact_points_kwargs=None, **kwargs)
A system of bodies on Keplerian orbits around a common central body.
Given the input parameters, the values of all other parameters will be computed, so a KeplerianOrbit instance will always have attributes for each argument. Note that the units of the computed attributes will all be in the standard units of this class (R_sun, M_sun, and days) except for rho_star, which will be in g / cm^3.
There are only specific combinations of input parameters that can be used:

- Either period or a must be given. If values are given for both parameters, then neither m_star nor rho_star can be defined, because the stellar density implied by each planet will be computed in rho_star.
- Only one of incl and b can be given.
- If a value is given for ecc, then omega must also be given.
- If only rho_star is defined, the radius of the central is assumed to be 1 * R_sun. Otherwise, at most two of m_star, r_star, and rho_star can be defined.
- Either t0 (reference transit) or t_periastron must be given, but not both.
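As a minimal sketch of one allowed combination (values are hypothetical, chosen to satisfy the rules above):

import exoplanet as xo

# period + t0 defines the ephemeris; b is used instead of incl,
# and giving ecc means omega must be given too.
orbit = xo.orbits.KeplerianOrbit(
    period=4.23,   # days
    t0=0.5,        # time of a reference transit (days)
    b=0.3,         # impact parameter (instead of incl)
    ecc=0.01,
    omega=0.1,     # radians
)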
get_planet_position(t, parallax=None)
The planets’ positions in the barycentric frame.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.

get_planet_velocity(t)
Get the planets’ velocity vectors.
Parameters: t – The times where the velocity should be evaluated.
Returns: The components of the velocity vector at t in units of R_sun/day.

get_radial_velocity(t, K=None, output_units=None)
Get the radial velocity of the star.
Note: The convention in exoplanet is that positive z points towards the observer. However, for consistency with the radial velocity literature, this method returns values where positive radial velocity corresponds to a redshift, as expected.
Returns: The reflex radial velocity evaluated at t.

get_relative_angles(t, parallax=None)
The planets’ positions relative to the star in the sky plane, in separation and position angle coordinates.
Note: This treats each planet independently and does not take the other planets into account when computing the position of the star. This is fine as long as the planet masses are small.
Parameters: t – The times where the position should be evaluated.
Returns: The separation (arcseconds) and position angle (radians, measured east of north) of the planet relative to the star.

get_relative_position(t, parallax=None)
The planets’ positions relative to the star in the X, Y, Z frame.
Note: This treats each planet independently and does not take the other planets into account when computing the position of the star. This is fine as long as the planet masses are small. In other words, the reflex motion of the star caused by the other planets is neglected when computing the relative coordinates of one of the planets.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.

get_relative_velocity(t)
The planets’ velocity relative to the star.
Note: This treats each planet independently and does not take the other planets into account when computing the position of the star. This is fine as long as the planet masses are small.
Parameters: t – The times where the velocity should be evaluated.
Returns: The components of the velocity vector at t in units of R_sun/day.

get_star_position(t, parallax=None)
The star’s position in the barycentric frame.
Note: If there are multiple planets in the system, this will return one column per planet with each planet’s contribution to the motion. The star’s full position can be computed by summing over the last axis.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.

get_star_velocity(t)
Get the star’s velocity vector.
Note: For a system with multiple planets, this will return one column per planet with the contributions from each planet. The total velocity can be found by summing along the last axis.
Parameters: t – The times where the velocity should be evaluated.
Returns: The components of the velocity vector at t in units of R_sun/day.
exoplanet.orbits.TTVOrbit(*args, **kwargs)
A generalization of a Keplerian orbit with transit timing variations.

Only one of the arguments ttvs or transit_times can be given, and the other will be computed from the one that was provided.

In practice, the way this works is that the time axis is shifted to account for the TTVs before solving for a standard Keplerian orbit. To identify which times correspond to which transits, this will find the closest labelled transit to the timestamp and adjust the timestamp accordingly. This means that things will go very badly if the time between neighboring transits is larger than 2*period.
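As a minimal sketch of the ttvs form (values hypothetical; this assumes the underlying linear ephemeris, period and t0, is given alongside the per-transit offsets, one array per planet):

import numpy as np
import exoplanet as xo

# Hypothetical "O-C" TTV offsets (days) for a single planet
ttvs = [np.array([0.00, 0.02, -0.02, 0.01])]
orbit = xo.orbits.TTVOrbit(period=10.0, t0=0.0, b=0.3, ttvs=ttvs)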
TTVOrbit exposes the same methods as KeplerianOrbit (get_planet_position, get_planet_velocity, get_radial_velocity, get_relative_angles, get_relative_position, get_relative_velocity, get_star_position, and get_star_velocity), with the same signatures, units, and caveats as documented above.
exoplanet.orbits.ReboundOrbit(*args, **kwargs)
An N-body system powered by the rebound integrator.

This takes all the same arguments as KeplerianOrbit, but these arguments define the orbital elements at some reference time (given by the rebound_t parameter). The positions and velocities of the bodies are then computed by numerically integrating the gravitational N-body system.

rebound-specific parameters can be provided as keyword arguments prefixed by rebound_. These will then be applied to the rebound.Simulation object as properties. For example, if you want to change the integrator, you could use rebound_integrator = "whfast". All of these parameters are passed directly through to rebound except rebound_t (the reference time), which is converted from days to years over two pi (the default time unit in rebound).
Note
exoplanet and rebound use different base units, but all of the unit conversions are handled behind the scenes in this object, so you should mostly use the exoplanet units when interacting with this class, and you should be very cautious about setting the rebound_G argument. One example of a case where you’ll need to use the rebound units is when you want to set the integrator step size using the rebound_dt parameter.
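A minimal sketch of the keyword forwarding described above (all values hypothetical):

import exoplanet as xo

# Same orbital-element arguments as KeplerianOrbit, plus rebound_*
# keywords that are set as properties on the rebound.Simulation.
orbit = xo.orbits.ReboundOrbit(
    period=[10.0, 21.5],          # days, two planets
    t0=[1.0, 3.0],
    m_planet=[3e-6, 1e-5],        # M_sun
    rebound_integrator="whfast",  # sets Simulation.integrator
    rebound_t=0.0,                # reference time in days
)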
get_planet_position(t)
The planets’ positions in the barycentric frame.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.

get_planet_velocity(t)
Get the planets’ velocity vectors.
Parameters: t – The times where the velocity should be evaluated.
Returns: The components of the velocity vector at t in units of R_sun/day.

get_radial_velocity(t, output_units=None)
Get the radial velocity of the star.
Note: The convention in exoplanet is that positive z points towards the observer. However, for consistency with the radial velocity literature, this method returns values where positive radial velocity corresponds to a redshift, as expected.
Note: Unlike the KeplerianOrbit implementation, the semi-amplitude K cannot be used with the ReboundOrbit. Also, the contributions of each planet are not returned separately; this will always return a single time series.
Returns: The reflex radial velocity evaluated at t.

get_relative_angles(t, parallax=None)
The planets’ positions relative to the star in the sky plane, in separation and position angle coordinates.
Note: This treats each planet independently and does not take the other planets into account when computing the position of the star. This is fine as long as the planet masses are small.
Parameters: t – The times where the position should be evaluated.
Returns: The separation (arcseconds) and position angle (radians, measured east of north) of the planet relative to the star.

get_relative_position(t)
The planets’ positions relative to the star in the X, Y, Z frame.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.

get_relative_velocity(t)
The planets’ velocity relative to the star.
Parameters: t – The times where the velocity should be evaluated.
Returns: The components of the velocity vector at t in units of R_sun/day.

get_star_position(t)
The star’s position in the barycentric frame.
Note: Unlike the KeplerianOrbit, this will not return the contributions from each planet separately.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.

get_star_velocity(t)
Get the star’s velocity vector.
Note: Unlike the KeplerianOrbit, this will not return the contributions from each planet separately.
Parameters: t – The times where the velocity should be evaluated.
Returns: The components of the velocity vector at t in units of R_sun/day.

in_transit(t, r=0.0, texp=None)
This is a no-op; all points are assumed to be in transit.
exoplanet.orbits.SimpleTransitOrbit(period=None, t0=0.0, b=0.0, duration=None, r_star=1.0)
An orbit representing a set of planets transiting a common central body.

This orbit is parameterized by the observables of a transiting system: period, phase, duration, and impact parameter.
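A minimal sketch using the parameters in the signature above (values hypothetical):

import exoplanet as xo

orbit = xo.orbits.SimpleTransitOrbit(
    period=3.5,      # days
    t0=0.8,          # reference transit time (days)
    b=0.2,           # impact parameter
    duration=0.12,   # transit duration (days)
    r_star=1.0,      # stellar radius (R_sun)
)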
get_relative_position(t)
The planets’ positions relative to the star.
Parameters: t – The times where the position should be evaluated.
Returns: The components of the position vector at t in units of R_sun.
exoplanet.LimbDarkLightCurve(u, model=None)
A limb darkened light curve computed using starry.
Parameters: u (vector) – A vector of limb darkening coefficients.

get_light_curve(orbit=None, r=None, t=None, texp=None, oversample=7, order=0, use_in_transit=True)
Get the light curve for an orbit at a set of times.
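A minimal sketch of evaluating a light curve inside a PyMC3 model (values hypothetical; texp enables exposure-time integration using the given oversample factor):

import numpy as np
import pymc3 as pm
import exoplanet as xo

with pm.Model():
    orbit = xo.orbits.KeplerianOrbit(period=3.5, t0=0.0, b=0.1)
    lc = xo.LimbDarkLightCurve([0.3, 0.2]).get_light_curve(
        orbit=orbit,
        r=0.04,                         # planet radius (R_sun)
        t=np.linspace(-0.1, 0.1, 500),  # times (days)
        texp=0.02,                      # exposure time (days)
    )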
exoplanet.gp.GP(kernel, x, diag, J=-1, model=None)
The interface for computing Gaussian Process models with celerite.

This class implements the method described in Foreman-Mackey et al. (2017) and Foreman-Mackey (2018) for scalable evaluation of Gaussian Process (GP) models in 1D.

Note
The input coordinates x must be sorted in ascending order, but this is not checked in the code. If the values are not sorted, the behavior of the algorithm is undefined.
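As a minimal sketch of the interface (toy data; this assumes the log-parameter spellings log_S0, log_w0, and log_Q for the SHOTerm defined below):

import numpy as np
import pymc3 as pm
import exoplanet as xo

# Toy data; x must be sorted in ascending order (see the note above)
np.random.seed(42)
x = np.sort(np.random.uniform(0, 10, 50))
yerr = 0.1 * np.ones_like(x)
y = np.sin(x) + yerr * np.random.randn(len(x))

with pm.Model() as model:
    log_S0 = pm.Normal("log_S0", mu=0.0, sd=10.0)
    log_w0 = pm.Normal("log_w0", mu=0.0, sd=10.0)
    kernel = xo.gp.terms.SHOTerm(log_S0=log_S0, log_w0=log_w0, log_Q=0.0)
    gp = xo.gp.GP(kernel, x, yerr ** 2)     # diag = per-point variance
    pm.Potential("loglike", gp.log_likelihood(y))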
exoplanet.gp.terms.Term(**kwargs)
The abstract base “term” that is the superclass of all other terms.

Subclasses should overload the terms.Term.get_real_coefficients() and terms.Term.get_complex_coefficients() methods.
exoplanet.gp.terms.RealTerm(**kwargs)
The simplest celerite term.

This term has the form

\(k(\tau) = a_j\,e^{-c_j\,\tau}\)

with the parameters a and c.

Strictly speaking, for a sum of terms, the parameter a could be allowed to go negative, but since it is somewhat subtle to ensure positive definiteness, we recommend keeping both parameters strictly positive. Advanced users can build a custom term that has negative coefficients, but care should be taken to ensure positivity.
exoplanet.gp.terms.ComplexTerm(**kwargs)
A general celerite term.

This term has the form

\(k(\tau) = e^{-c_j\,\tau}\,\left[a_j\,\cos(d_j\,\tau) + b_j\,\sin(d_j\,\tau)\right]\)

with the parameters a, b, c, and d.

This term will only correspond to a positive definite kernel (on its own) if \(a_j\,c_j \ge b_j\,d_j\).
exoplanet.gp.terms.SHOTerm(*args, **kwargs)
A term representing a stochastically-driven, damped harmonic oscillator.

The PSD of this term is

\(S(\omega) = \sqrt{\frac{2}{\pi}}\,\frac{S_0\,\omega_0^4}{(\omega^2 - \omega_0^2)^2 + \omega_0^2\,\omega^2/Q^2}\)

with the parameters S0, Q, and w0.
exoplanet.gp.terms.Matern32Term(**kwargs)
A term that approximates a Matern-3/2 function.

This term is defined in terms of the parameters sigma and rho. The parameter eps controls the quality of the approximation since, in the limit \(\epsilon \to 0\), the term becomes the Matern-3/2 function

\(k(\tau) = \sigma^2\,\left(1 + \frac{\sqrt{3}\,\tau}{\rho}\right)\,\exp\left(-\frac{\sqrt{3}\,\tau}{\rho}\right)\)
exoplanet.gp.terms.RotationTerm(**kwargs)
A mixture of two SHO terms that can be used to model stellar rotation.

This term has two modes in Fourier space: one at period and one at 0.5 * period. This can be a good descriptive model for a wide range of stochastic variability in stellar time series, from rotation to pulsations.
exoplanet.estimate_semi_amplitude(periods, x, y, yerr=None, t0s=None)
Estimate the RV semi-amplitudes for planets in an RV series.
Returns: An estimate of the semi-amplitude of each planet in the units of y.
exoplanet.estimate_minimum_mass(periods, x, y, yerr=None, t0s=None, m_star=1)
Estimate the minimum mass(es) for planets in an RV series.
Returns: An estimate of the minimum mass of each planet as an AstroPy Quantity with units of M_jupiter.
exoplanet.lomb_scargle_estimator(x, y, yerr=None, min_period=None, max_period=None, filter_period=None, max_peaks=2, **kwargs)
Estimate the period of a time series using the periodogram.
Returns: A dictionary with the computed periodogram and the parameters of the highest peaks.
exoplanet.autocorr_estimator(x, y, yerr=None, min_period=None, max_period=None, oversample=2.0, smooth=2.0, max_peaks=10)
Estimate the period of a time series using the autocorrelation function.
Note
The signal is interpolated onto a uniform grid in time so that the autocorrelation function can be computed.
Returns: A dictionary with the computed autocorrelation function and the estimated period, in a format compatible with the lomb_scargle_estimator() output.
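A minimal sketch of calling the two estimators above (toy data with a 7.5-day period; only the signatures listed above are assumed):

import numpy as np
import exoplanet as xo

np.random.seed(42)
x = np.sort(np.random.uniform(0, 100, 500))
y = np.sin(2 * np.pi * x / 7.5) + 0.1 * np.random.randn(len(x))

ls = xo.lomb_scargle_estimator(x, y, min_period=1.0, max_period=50.0)
ac = xo.autocorr_estimator(x, y, min_period=1.0, max_period=50.0)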
exoplanet.distributions.UnitUniform(*args, **kwargs)
A uniform distribution between zero and one.

This can sometimes get better performance than pm.Uniform.dist(0, 1).
exoplanet.distributions.UnitVector(*args, **kwargs)
A vector where the sum of squares is fixed to unity.

For a multidimensional shape, the normalization is performed along the last dimension.
exoplanet.distributions.Angle(*args, **kwargs)
An angle constrained to be in the range -pi to pi.

The actual sampling is performed in the two-dimensional vector space (sin(theta), cos(theta)) so that the sampler doesn’t see a discontinuity at pi.
exoplanet.distributions.Periodic(lower=0, upper=1, **kwargs)
A periodic parameter in a given range.

Like the Angle distribution, the actual sampling is performed in a two-dimensional vector space (sin(theta), cos(theta)) and then transformed into the range [lower, upper).
exoplanet.distributions.QuadLimbDark(*args, **kwargs)
An uninformative prior for quadratic limb darkening parameters.

This is an implementation of the Kipping (2013) reparameterization of the two-parameter limb darkening model to allow for efficient and uninformative sampling.
exoplanet.distributions.ImpactParameter(ror=None, **kwargs)
The impact parameter distribution for a transiting planet.
Parameters: ror – A scalar, tensor, or PyMC3 distribution representing the radius ratio between the planet and star. Conditioned on a value of ror, this will be uniformly distributed between 0 and 1 + ror.
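A minimal sketch of conditioning the impact parameter on a radius-ratio parameter (variable names hypothetical):

import pymc3 as pm
import exoplanet as xo

with pm.Model() as model:
    ror = pm.Uniform("ror", lower=0.01, upper=0.2)
    b = xo.distributions.ImpactParameter("b", ror=ror)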
exoplanet.distributions.eccentricity.kipping13(name, fixed=False, long=None, model=None, **kwargs)
The beta eccentricity distribution, with the beta distribution parameters fit by Kipping (2013).
Returns: The eccentricity distribution.
exoplanet.distributions.eccentricity.vaneylen19(name, fixed=False, multi=False, model=None, **kwargs)
The eccentricity distribution for small planets.

The mixture distribution fit by Van Eylen et al. (2019) to a population of well-characterized small transiting planets observed by Kepler.
Returns: The eccentricity distribution.
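A minimal sketch of using these priors inside a model (argument usage follows the signatures above):

import pymc3 as pm
import exoplanet as xo

with pm.Model() as model:
    # Beta prior from Kipping (2013)
    ecc1 = xo.distributions.eccentricity.kipping13("ecc1")
    # Mixture prior from Van Eylen et al. (2019), multi-planet flavor
    ecc2 = xo.distributions.eccentricity.vaneylen19("ecc2", multi=True)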
exoplanet.optimize(start=None, vars=None, model=None, return_info=False, verbose=True, **kwargs)
Maximize the log prob of a PyMC3 model using scipy.

All extra arguments are passed directly to the scipy.optimize.minimize function.
exoplanet.eval_in_model(var, point=None, return_func=False, model=None, **kwargs)
Evaluate a Theano tensor or PyMC3 variable in a PyMC3 model.

This method builds a Theano function for evaluating a node in the graph given the required parameters. This will also cache the compiled Theano function in the current pymc3.Model to reduce the overhead of calling this function many times.
Returns: The value of var evaluated at point and, if return_func is True, the compiled Theano function as well.
exoplanet.get_samples_from_trace(trace, size=1)
Generate random samples from a PyMC3 MultiTrace.
exoplanet.get_dense_nuts_step(start=None, adaptation_window=101, doubling=True, model=None, **kwargs)
Get a NUTS step function with a dense mass matrix.

The entries in the mass matrix will be tuned based on the sample covariances during tuning. All extra arguments are passed directly to pymc3.NUTS.
exoplanet.orbits.ttv.compute_expected_transit_times(min_time, max_time, period, t0)
Compute the expected transit times within a dataset.
Returns: A list of arrays of expected transit times for each planet.
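A minimal sketch using the signature above (values hypothetical):

from exoplanet.orbits.ttv import compute_expected_transit_times

# Expected transit times for two planets over a 100-day baseline
expected = compute_expected_transit_times(
    min_time=0.0, max_time=100.0,
    period=[10.0, 21.5], t0=[1.0, 3.0],
)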
exoplanet.units.with_unit(obj, unit)
Decorate a Theano tensor with Astropy units.
exoplanet.units.has_unit(obj)
Does an object have units as defined by exoplanet?

exoplanet.units.to_unit(obj, target)
Convert a Theano tensor with units to a target set of units.
Returns: A Theano tensor in the right units.
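A minimal sketch of the three helpers above (assuming a Theano tensor input, per the signatures listed):

import astropy.units as u
import theano.tensor as tt
from exoplanet import units as xou

x = xou.with_unit(tt.ones(3), u.R_sun)  # attach units to a tensor
print(xou.has_unit(x))                  # True
y = xou.to_unit(x, u.au)                # convert R_sun -> au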
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
Hamiltonian Monte Carlo (HMC) methods haven’t been widely used in astrophysics, but they are the standard methods for probabilistic inference using Markov chain Monte Carlo (MCMC) in many other fields. exoplanet is designed to provide the building blocks for fitting many exoplanet datasets using this technology, and this tutorial presents some of the basic features of the PyMC3 modeling language and inference engine. The documentation for PyMC3 includes many other tutorials that you should check out to get more familiar with the features that are available.
In this tutorial, we will go through two simple examples of fitting some data using PyMC3. The first is the classic problem of fitting a line to data with unknown error bars, and the second is a more relevant example where we fit a radial velocity model to the public radial velocity observations of 51 Peg. You can read more about fitting lines to data in the bible of line fitting, and you can see another example of fitting the 51 Peg data using HMC (this time using Stan) here.
My standard intro to a new modeling language or inference framework is to fit a line to data. So. Let’s do that with PyMC3.
To start, we’ll generate some fake data using a linear model. Feel free to change the random number seed to try out a different dataset.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
true_m = 0.5
true_b = -1.3
true_logs = np.log(0.3)
x = np.sort(np.random.uniform(0, 5, 50))
y = true_b + true_m * x + np.exp(true_logs) * np.random.randn(len(x))
plt.plot(x, y, ".k")
plt.ylim(-2, 2)
plt.xlabel("x")
plt.ylabel("y");
To fit a model to these data, our model will have 3 parameters: the slope \(m\), the intercept \(b\), and the log of the uncertainty \(\log(\sigma)\). To start, let’s choose broad uniform priors on these parameters:

\(p(m) = \begin{cases} 1/10, & \text{if } -5 < m < 5 \\ 0, & \text{otherwise} \end{cases}\)
\(p(b) = \begin{cases} 1/10, & \text{if } -5 < b < 5 \\ 0, & \text{otherwise} \end{cases}\)
\(p(\log(\sigma)) = \begin{cases} 1/10, & \text{if } -5 < \log(\sigma) < 5 \\ 0, & \text{otherwise} \end{cases}\)

Then, the log-likelihood function will be

\(\log p(\{y_n\}\,|\,m,\,b,\,\log(\sigma)) = -\frac{1}{2}\sum_{n=1}^{N}\left[\frac{(y_n - m\,x_n - b)^2}{\sigma^2} + \log(2\,\pi\,\sigma^2)\right]\)

[Note: the second normalization term is needed in this model because we are fitting for \(\sigma\) and the second term is not a constant.]

Another way of writing this model that might not be familiar is the following:

\(m \sim \mathrm{Uniform}(-5,\,5)\)
\(b \sim \mathrm{Uniform}(-5,\,5)\)
\(\log(\sigma) \sim \mathrm{Uniform}(-5,\,5)\)
\(y_n \sim \mathrm{Normal}(m\,x_n + b,\,\sigma)\)

This is the way that a model like this is often defined in statistics, and it will be useful when we implement our model in PyMC3, so take a moment to make sure that you understand the notation.
Now, let’s implement this model in PyMC3. The documentation for the distributions available in PyMC3’s modeling language can be found here and these will come in handy as you go on to write your own models.
import pymc3 as pm

with pm.Model() as model:
    # Define the priors on each parameter:
    m = pm.Uniform("m", lower=-5, upper=5)
    b = pm.Uniform("b", lower=-5, upper=5)
    logs = pm.Uniform("logs", lower=-5, upper=5)

    # Define the likelihood. A few comments:
    #  1. For mathematical operations like "exp", you can't use
    #     numpy. Instead, use the mathematical operations defined
    #     in "pm.math".
    #  2. To condition on data, you use the "observed" keyword
    #     argument to any distribution. In this case, we want to
    #     use the "Normal" distribution (look up the docs for
    #     this).
    pm.Normal("obs", mu=m * x + b, sd=pm.math.exp(logs), observed=y)

    # This is how you will sample the model. Take a look at the
    # docs to see the other parameters that are available.
    trace = pm.sample(draws=1000, tune=1000, chains=2)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [logs, b, m]
Sampling 2 chains: 100%|██████████| 4000/4000 [00:01<00:00, 2048.29draws/s]
Now that we have samples, let’s make some diagnostic plots. The first plot to look at is the “traceplot” implemented in PyMC3. In this plot, you’ll see the marginalized distribution for each parameter on the left and the trace plot (parameter value as a function of step number) on the right. In each panel, you should see two lines with different colors. These are the results of different independent chains, and if the results are substantially different in the different chains then there is probably something going wrong.
pm.traceplot(trace, varnames=["m", "b", "logs"]);
/mnt/home/dforeman/miniconda3/envs/autoexoplanet/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument varnames renamed to var_names, and will be removed in pymc3 3.8 warnings.warn('Keyword argument {old} renamed to {new}, and will be removed in pymc3 3.8'.format(old=old, new=new))
It’s also good to quantify that “looking substantially different” argument. This is implemented in PyMC3 as the “summary” function. In this table, some of the key columns to look at are n_eff and Rhat.

- n_eff shows an estimate of the number of effective (or independent) samples for that parameter. In this case, n_eff should probably be around 500 per chain (there should have been 2 chains run).
- Rhat shows the Gelman–Rubin statistic, and it should be close to 1.
pm.summary(trace, varnames=["m", "b", "logs"])
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
m | 0.509379 | 0.026840 | 0.000868 | 0.453696 | 0.562176 | 928.294846 | 1.001888 |
b | -1.321782 | 0.071442 | 0.002324 | -1.450008 | -1.169206 | 1024.597914 | 1.000324 |
logs | -1.271705 | 0.105530 | 0.002755 | -1.474413 | -1.067339 | 1157.270457 | 0.999515 |
The last diagnostic plot that we’ll make here is the corner plot made using corner.py. The easiest way to do this using PyMC3 is to first convert the trace to a Pandas DataFrame and then pass that to corner.py.
import corner # https://corner.readthedocs.io
samples = pm.trace_to_dataframe(trace, varnames=["m", "b", "logs"])
corner.corner(samples, truths=[true_m, true_b, true_logs]);
Extra credit: Here are a few suggestions for things to try out while getting more familiar with PyMC3: try initializing the parameters using the testval argument to the distributions. Does this improve performance in this case? It will substantially improve performance in more complicated examples.

While the above example was cute, it doesn’t really fully exploit the power of PyMC3 and it doesn’t really show some of the real issues that you will face when you use PyMC3 as an astronomer. To get a better sense of how you might use PyMC3 in Real Life™, let’s take a look at a more realistic example: fitting a Keplerian orbit to radial velocity observations.
One of the key aspects of this problem that I want to highlight is the fact that PyMC3 (and the underlying model-building framework Theano) doesn’t have out-of-the-box support for the root-finding that is required to solve Kepler’s equation. As part of the process of computing a Keplerian RV model, we must solve the equation

\(M = E - e\,\sin E\)

for the eccentric anomaly \(E\) given some mean anomaly \(M\) and eccentricity \(e\). There are commonly accepted methods of solving this equation using Newton’s method, but if we want to expose that to PyMC3, we have to define a custom Theano operation with a custom gradient. I won’t go into the details of the math (because I blogged about it) and I won’t go into the details of the implementation (because you can take a look at it on GitHub). So, for this tutorial, we’ll use the custom Kepler solver that is implemented as part of exoplanet and fit the publicly available radial velocity observations of the famous exoplanetary system 51 Peg using PyMC3.
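For intuition, here is what that Newton iteration looks like in plain NumPy (an illustrative sketch only, not exoplanet’s implementation, which is a compiled Theano op with a custom gradient):

import numpy as np

def kepler_newton(M, e, tol=1e-12, max_iter=50):
    # Solve M = E - e*sin(E) for E by Newton's method
    E = M + e * np.sin(M)  # a standard starting guess
    for _ in range(max_iter):
        delta = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
        E -= delta
        if np.all(np.abs(delta) < tol):
            break
    return E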
First, we need to download the data from the exoplanet archive:
import numpy as np
import requests
import pandas as pd
import matplotlib.pyplot as plt

# Download the dataset from the Exoplanet Archive:
url = "https://exoplanetarchive.ipac.caltech.edu/data/ExoData/0113/0113357/data/UID_0113357_RVC_001.tbl"
r = requests.get(url)
if r.status_code != requests.codes.ok:
    r.raise_for_status()

data = np.array(
    [
        l.split()
        for l in r.text.splitlines()
        if not l.startswith("\\") and not l.startswith("|")
    ],
    dtype=float,
)
t, rv, rv_err = data.T
t -= np.mean(t)
# Plot the observations "folded" on the published period:
# Butler et al. (2006) https://arxiv.org/abs/astro-ph/0607493
lit_period = 4.230785
plt.errorbar((t % lit_period) / lit_period, rv, yerr=rv_err, fmt=".k", capsize=0)
plt.xlim(0, 1)
plt.ylim(-110, 110)
plt.annotate(
"period = {0:.6f} days".format(lit_period),
xy=(1, 0),
xycoords="axes fraction",
xytext=(-5, 5),
textcoords="offset points",
ha="right",
va="bottom",
fontsize=12,
)
plt.ylabel("radial velocity [m/s]")
plt.xlabel("phase");
Now, here’s the implementation of a radial velocity model in PyMC3. Some of this will look familiar after the Hello World example, but things are a bit more complicated now. Take a minute to look through this and see if you can follow it. There’s a lot going on, so I want to point out a few things to pay attention to:

- All of the mathematical operations (like exp and sqrt) are being performed using Theano instead of NumPy.
- Some of the intermediate results are wrapped in Deterministic distributions. This can be useful because it allows us to track values as the chain progresses even if they’re not parameters. For example, after sampling, we will have a sample for bkg (the background RV trend) for each step in the chain. This can be especially useful for making plots of the results.
- When sampling an angle (like the argument of periastron w in the model below), it can be inefficient to sample in the angle directly because of the fact that the value wraps around at \(2\pi\). Instead, it can be better to sample the unit vector specified by the angle. In practice, this can be achieved by sampling a 2-vector from an isotropic Gaussian and normalizing the components by the norm. This is implemented as part of exoplanet in the exoplanet.distributions.Angle class.

import theano.tensor as tt
from exoplanet.orbits import get_true_anomaly
from exoplanet.distributions import Angle

with pm.Model() as model:

    # Parameters
    logK = pm.Uniform(
        "logK",
        lower=0,
        upper=np.log(200),
        testval=np.log(0.5 * (np.max(rv) - np.min(rv))),
    )
    logP = pm.Uniform("logP", lower=0, upper=np.log(10), testval=np.log(lit_period))
    phi = pm.Uniform("phi", lower=0, upper=2 * np.pi, testval=0.1)
    e = pm.Uniform("e", lower=0, upper=1, testval=0.1)
    w = Angle("w")
    logjitter = pm.Uniform(
        "logjitter", lower=-10, upper=5, testval=np.log(np.mean(rv_err))
    )
    rv0 = pm.Normal("rv0", mu=0.0, sd=10.0, testval=0.0)
    rvtrend = pm.Normal("rvtrend", mu=0.0, sd=10.0, testval=0.0)

    # Deterministic transformations
    n = 2 * np.pi * tt.exp(-logP)
    P = pm.Deterministic("P", tt.exp(logP))
    K = pm.Deterministic("K", tt.exp(logK))
    cosw = tt.cos(w)
    sinw = tt.sin(w)
    s2 = tt.exp(2 * logjitter)
    t0 = (phi + w) / n

    # The RV model
    bkg = pm.Deterministic("bkg", rv0 + rvtrend * t / 365.25)
    M = n * t - (phi + w)

    # This is the line that uses the custom Kepler solver
    f = get_true_anomaly(M, e + tt.zeros_like(M))
    rvmodel = pm.Deterministic(
        "rvmodel", bkg + K * (cosw * (tt.cos(f) + e) - sinw * tt.sin(f))
    )

    # Condition on the observations
    err = tt.sqrt(rv_err ** 2 + tt.exp(2 * logjitter))
    pm.Normal("obs", mu=rvmodel, sd=err, observed=rv)

    # Compute the phased RV signal
    phase = np.linspace(0, 1, 500)
    M_pred = 2 * np.pi * phase - (phi + w)
    f_pred = get_true_anomaly(M_pred, e + tt.zeros_like(M_pred))
    rvphase = pm.Deterministic(
        "rvphase", K * (cosw * (tt.cos(f_pred) + e) - sinw * tt.sin(f_pred))
    )
In this case, I’ve found that it is useful to first optimize the parameters to find the “maximum a posteriori” (MAP) parameters and then start the sampler from there. This is useful here because MCMC is not designed to find the maximum of the posterior; it’s just meant to sample the shape of the posterior. The performance of all MCMC methods can be really bad when the initialization isn’t good (especially when some parameters are very well constrained). To find the maximum a posteriori parameters using PyMC3, you can use the exoplanet.optimize() function:
from exoplanet import optimize

with model:
    map_params = optimize()
optimizing logp for variables: [rvtrend, rv0, logjitter, w, e, phi, logP, logK]
70it [00:05, 12.53it/s, logp=-8.351913e+02]
message: Desired error not necessarily achieved due to precision loss.
logp: -1325.8433398976674 -> -835.1912653583977
Let’s make a plot to check that this initialization looks reasonable. In the top plot, we’re looking at the RV observations as a function of time with the initial guess for the long-term trend overplotted in blue. In the lower panel, we plot the “folded” curve where we have wrapped the observations onto the best-fit period, with the prediction for a single orbit overplotted in orange. If this doesn’t look good, try adjusting the initial guesses for the parameters and see if you can get a better fit.
Exercise: Try changing the initial guesses for the parameters (as specified by the testval argument) and see how sensitive the results are to these values. Are there some parameters that are less important? Why is this?
fig, axes = plt.subplots(2, 1, figsize=(8, 8))
period = map_params["P"]
ax = axes[0]
ax.errorbar(t, rv, yerr=rv_err, fmt=".k")
ax.plot(t, map_params["bkg"], color="C0", lw=1)
ax.set_ylim(-110, 110)
ax.set_ylabel("radial velocity [m/s]")
ax.set_xlabel("time [days]")
ax = axes[1]
ax.errorbar(t % period, rv - map_params["bkg"], yerr=rv_err, fmt=".k")
ax.plot(phase * period, map_params["rvphase"], color="C1", lw=1)
ax.set_ylim(-110, 110)
ax.set_ylabel("radial velocity [m/s]")
ax.set_xlabel("phase [days]")
plt.tight_layout()
Now let’s sample the posterior starting from our MAP estimate.
with model:
    trace = pm.sample(draws=2000, tune=1000, start=map_params, chains=2)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [rvtrend, rv0, logjitter, w, e, phi, logP, logK]
Sampling 2 chains: 100%|██████████| 6000/6000 [00:28<00:00, 156.19draws/s]
There were 2 divergences after tuning. Increase target_accept or reparameterize.
There were 56 divergences after tuning. Increase target_accept or reparameterize.
As above, it’s always a good idea to take a look at the summary statistics for the chain. If everything went as planned, there should be more than 1000 effective samples per chain and the Rhat values should be close to 1. (Not too bad for less than 30 seconds of run time!)
pm.summary(
    trace, varnames=["logK", "logP", "phi", "e", "w", "logjitter", "rv0", "rvtrend"]
)
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
logK | 4.019058 | 0.009742 | 1.684927e-04 | 3.999675 | 4.037873 | 3257.479702 | 0.999763 |
logP | 1.442387 | 0.000009 | 1.536824e-07 | 1.442370 | 1.442404 | 2962.140497 | 1.000232 |
phi | 0.397220 | 0.009465 | 1.574170e-04 | 0.378403 | 0.415974 | 3057.642931 | 0.999770 |
e | 0.010475 | 0.007458 | 1.876348e-04 | 0.000004 | 0.024664 | 1611.628553 | 0.999910 |
w | 0.606659 | 1.322090 | 3.600565e-02 | -2.315069 | 3.028083 | 1227.340740 | 0.999750 |
logjitter | -4.764163 | 3.110690 | 8.357780e-02 | -9.917167 | 0.250216 | 1299.186982 | 0.999751 |
rv0 | -1.822466 | 0.385066 | 6.534831e-03 | -2.568321 | -1.065913 | 3289.572574 | 0.999871 |
rvtrend | -1.591570 | 0.187884 | 3.758002e-03 | -1.958871 | -1.231343 | 2674.003895 | 0.999760 |
Similarly, we can make the corner plot again for this model.
samples = pm.trace_to_dataframe(trace, varnames=["K", "P", "e", "w"])
corner.corner(samples);
Finally, the last plot that we’ll make here is of the posterior predictive density. In this case, this means that we want to look at the distribution of predicted models that are consistent with the data. As above, the top plot shows the raw observations as black error bars and the RV trend model is overplotted in blue. But, this time, the blue line is actually composed of 25 lines that are samples from the posterior over trends that are consistent with the data. In the bottom panel, the orange lines indicate the same 25 posterior samples for the RV curve of one orbit.
fig, axes = plt.subplots(2, 1, figsize=(8, 8))
period = map_params["P"]
ax = axes[0]
ax.errorbar(t, rv, yerr=rv_err, fmt=".k")
ax.set_ylabel("radial velocity [m/s]")
ax.set_xlabel("time [days]")
ax = axes[1]
ax.errorbar(t % period, rv - map_params["bkg"], yerr=rv_err, fmt=".k")
ax.set_ylabel("radial velocity [m/s]")
ax.set_xlabel("phase [days]")
for i in np.random.randint(len(trace) * trace.nchains, size=25):
    axes[0].plot(t, trace["bkg"][i], color="C0", lw=1, alpha=0.3)
    axes[1].plot(phase * period, trace["rvphase"][i], color="C1", lw=1, alpha=0.3)
axes[0].set_ylim(-110, 110)
axes[1].set_ylim(-110, 110)
plt.tight_layout()
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
exoplanet comes bundled with a few utilities that can make it easier to use and debug PyMC3 models for fitting exoplanet data. This tutorial briefly describes these features and their use.
The main extra is the exoplanet.get_dense_nuts_step() function that extends the PyMC3 sampling procedure to include support for learning off-diagonal elements of the mass matrix. This is very important for any problem where there are covariances between the parameters (this is true for pretty much all exoplanet models). A thorough discussion of this can be found elsewhere online, but here is a simple demo where we sample a covariant Gaussian using exoplanet.get_dense_nuts_step().
First, we generate a random positive definite covariance matrix for the Gaussian:
import numpy as np
ndim = 5
np.random.seed(42)
L = np.random.randn(ndim, ndim)
L[np.diag_indices_from(L)] = 0.1 * np.exp(L[np.diag_indices_from(L)])
L[np.triu_indices_from(L, 1)] = 0.0
cov = np.dot(L, L.T)
And then we can sample this using PyMC3 and exoplanet.get_dense_nuts_step():
import pymc3 as pm
import exoplanet as xo

with pm.Model() as model:
    pm.MvNormal("x", mu=np.zeros(ndim), chol=L, shape=(ndim,))
    trace = pm.sample(tune=2000, draws=2000, chains=4, step=xo.get_dense_nuts_step())
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [x]
Sampling 4 chains: 100%|██████████| 16000/16000 [00:07<00:00, 2115.59draws/s]
This is a little more verbose than the standard use of PyMC3, but the performance is several orders of magnitude better than you would get without the mass matrix tuning. As you can see from the pymc3.summary, the autocorrelation time of this chain is about 1, as we would expect for a simple problem like this.
pm.summary(trace)
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
x__0 | -0.000156 | 0.161445 | 0.001604 | -0.321006 | 0.304166 | 13571.845512 | 0.999959 |
x__1 | 0.000685 | 0.530898 | 0.004554 | -1.052721 | 1.022327 | 13769.293014 | 0.999780 |
x__2 | -0.002025 | 0.650792 | 0.006395 | -1.295745 | 1.261359 | 11679.233640 | 1.000058 |
x__3 | -0.005420 | 1.169616 | 0.010965 | -2.320365 | 2.281117 | 12123.977550 | 0.999993 |
x__4 | -0.002802 | 2.040530 | 0.019482 | -3.925355 | 3.923949 | 12574.022750 | 0.999917 |
I find that when I’m debugging a PyMC3 model, I often want to inspect the value of some part of the model for a given set of parameters. As far as I can tell, there isn’t a simple way to do this in PyMC3, so exoplanet comes with a hack for doing this: exoplanet.eval_in_model(). This function handles the mapping between named PyMC3 variables and the input required by the Theano function that can evaluate the requested variable or tensor.
As a demo, let’s say that we’re fitting a parabola to some data:
np.random.seed(42)
x = np.sort(np.random.uniform(-1, 1, 50))

with pm.Model() as model:
    logs = pm.Normal("logs", mu=-3.0, sd=1.0)
    a0 = pm.Normal("a0")
    a1 = pm.Normal("a1")
    a2 = pm.Normal("a2")
    mod = a0 + a1 * x + a2 * x ** 2

    # Sample from the prior
    prior_sample = pm.sample_prior_predictive(samples=1)
    y = xo.eval_in_model(mod, prior_sample)
    y += np.exp(prior_sample["logs"]) * np.random.randn(len(y))

    # Add the likelihood
    pm.Normal("obs", mu=mod, sd=pm.math.exp(logs), observed=y)

    # Fit the data
    map_soln = xo.optimize()
    trace = pm.sample()
optimizing logp for variables: [a2, a1, a0, logs]
26it [00:00, 292.51it/s, logp=4.272312e+01]
message: Optimization terminated successfully.
logp: -180.51036900636572 -> 42.72311721910288
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [a2, a1, a0, logs]
Sampling 4 chains: 100%|██████████| 4000/4000 [00:01<00:00, 3206.57draws/s]
The acceptance probability does not match the target. It is 0.8788002200800235, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.8834353310005734, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.8827839735098203, but should be close to 0.8. Try to increase the number of tuning steps.
After running the fit, it might be interesting to look at the predictions of the model. We could have added a pymc3.Deterministic node for everything, but that can end up taking a lot of memory, and sometimes it’s useful to be able to experiment with different outputs. Using exoplanet.eval_in_model() we can, for example, evaluate the maximum a posteriori (MAP) model prediction on a fine grid:
import matplotlib.pyplot as plt

x_grid = np.linspace(-1.1, 1.1, 5000)
with model:
    pred = xo.eval_in_model(a0 + a1 * x_grid + a2 * x_grid ** 2, map_soln)
plt.plot(x, y, ".k", label="data")
plt.plot(x_grid, pred, label="map")
plt.legend(fontsize=12)
plt.xlabel("x")
plt.ylabel("y")
plt.xlim(-1.1, 1.1);
We can also combine this with exoplanet.get_samples_from_trace()
to plot this prediction for a set of samples in the trace.
samples = np.empty((50, len(x_grid)))
with model:
    y_grid = a0 + a1 * x_grid + a2 * x_grid ** 2
    for i, sample in enumerate(xo.get_samples_from_trace(trace, size=50)):
        samples[i] = xo.eval_in_model(y_grid, sample)
plt.plot(x, y, ".k", label="data")
plt.plot(x_grid, pred, label="map")
plt.plot(x_grid, samples[0], color="C1", alpha=0.1, label="posterior")
plt.plot(x_grid, samples[1:].T, color="C1", alpha=0.1)
plt.legend(fontsize=12)
plt.xlabel("x")
plt.ylabel("y")
plt.xlim(-1.1, 1.1);
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
In this tutorial, we will demonstrate how to fit radial velocity observations of an exoplanetary system using exoplanet. We will follow the getting started tutorial from the excellent RadVel package, where they fit for the parameters of the two planets in the K2-24 system.
First, let’s download the data from RadVel:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
url = "https://raw.githubusercontent.com/California-Planet-Search/radvel/master/example_data/epic203771098.csv"
data = pd.read_csv(url, index_col=0)
x = np.array(data.t)
y = np.array(data.vel)
yerr = np.array(data.errvel)
plt.errorbar(x, y, yerr=yerr, fmt=".k")
plt.xlabel("time [days]")
plt.ylabel("radial velocity [m/s]");
Now, we know the periods and transit times for the planets from the K2 light curve, so let’s start by using the exoplanet.estimate_semi_amplitude() function to estimate the expected RV semi-amplitudes for the planets.
import exoplanet as xo
periods = [20.8851, 42.3633]
period_errs = [0.0003, 0.0006]
t0s = [2072.7948, 2082.6251]
t0_errs = [0.0007, 0.0004]
Ks = xo.estimate_semi_amplitude(periods, x, y, yerr, t0s=t0s)
print(Ks, "m/s")
[5.05069163 5.50983542] m/s
Now that we have the data and an estimate of the initial values for the parameters, let’s start defining the probabilistic model in PyMC3 (take a look at A quick intro to PyMC3 for exoplaneteers if you’re new to PyMC3). First, we’ll define our priors on the parameters:
import pymc3 as pm
import theano.tensor as tt
with pm.Model() as model:

    # Gaussian priors based on transit data (from Petigura et al.)
    t0 = pm.Normal("t0", mu=np.array(t0s), sd=np.array(t0_errs), shape=2)
    P = pm.Bound(pm.Normal, lower=0)(
        "P",
        mu=np.array(periods),
        sd=np.array(period_errs),
        shape=2,
        testval=np.array(periods),
    )

    # Wide log-normal prior for semi-amplitude
    logK = pm.Bound(pm.Normal, lower=0)(
        "logK", mu=np.log(Ks), sd=10.0, shape=2, testval=np.log(Ks)
    )

    # Eccentricity & argument of periastron
    ecc = xo.distributions.UnitUniform("ecc", shape=2, testval=np.array([0.1, 0.1]))
    omega = xo.distributions.Angle("omega", shape=2)

    # Jitter & a quadratic RV trend
    logs = pm.Normal("logs", mu=np.log(np.median(yerr)), sd=5.0)
    trend = pm.Normal("trend", mu=0, sd=10.0 ** -np.arange(3)[::-1], shape=3)
Now we’ll define the orbit model:
with model:
    # Set up the orbit
    orbit = xo.orbits.KeplerianOrbit(period=P, t0=t0, ecc=ecc, omega=omega)

    # Set up the RV model and save it as a deterministic
    # for plotting purposes later
    vrad = orbit.get_radial_velocity(x, K=tt.exp(logK))
    pm.Deterministic("vrad", vrad)

    # Define the background model
    A = np.vander(x - 0.5 * (x.min() + x.max()), 3)
    bkg = pm.Deterministic("bkg", tt.dot(A, trend))

    # Sum over planets and add the background to get the full model
    rv_model = pm.Deterministic("rv_model", tt.sum(vrad, axis=-1) + bkg)
For plotting purposes, it can be useful to also save the model on a fine grid in time.
t = np.linspace(x.min() - 5, x.max() + 5, 1000)

with model:
    vrad_pred = orbit.get_radial_velocity(t, K=tt.exp(logK))
    pm.Deterministic("vrad_pred", vrad_pred)
    A_pred = np.vander(t - 0.5 * (x.min() + x.max()), 3)
    bkg_pred = pm.Deterministic("bkg_pred", tt.dot(A_pred, trend))
    rv_model_pred = pm.Deterministic(
        "rv_model_pred", tt.sum(vrad_pred, axis=-1) + bkg_pred
    )
Now, we can plot the initial model:
plt.errorbar(x, y, yerr=yerr, fmt=".k")

with model:
    plt.plot(t, xo.eval_in_model(vrad_pred), "--k", alpha=0.5)
    plt.plot(t, xo.eval_in_model(bkg_pred), ":k", alpha=0.5)
    plt.plot(t, xo.eval_in_model(rv_model_pred), label="model")

plt.legend(fontsize=10)
plt.xlim(t.min(), t.max())
plt.xlabel("time [days]")
plt.ylabel("radial velocity [m/s]")
plt.title("initial model");
In this plot, the background is the dotted line, the individual planets are the dashed lines, and the full model is the blue line.
It doesn’t look amazing, so let’s add in the likelihood and fit for the maximum a posteriori parameters.
with model:
    err = tt.sqrt(yerr ** 2 + tt.exp(2 * logs))
    pm.Normal("obs", mu=rv_model, sd=err, observed=y)

    map_soln = xo.optimize(start=model.test_point, vars=[trend])
    map_soln = xo.optimize(start=map_soln)
optimizing logp for variables: [trend]
13it [00:04, 3.01it/s, logp=-6.484820e+01]
message: Optimization terminated successfully.
logp: -79.73266285618838 -> -64.8482026233154
optimizing logp for variables: [trend, logs, omega, ecc, logK, P, t0]
94it [00:00, 180.82it/s, logp=-1.427676e+01]
message: Optimization terminated successfully.
logp: -64.8482026233154 -> -14.276760262380932
plt.errorbar(x, y, yerr=yerr, fmt=".k")
plt.plot(t, map_soln["vrad_pred"], "--k", alpha=0.5)
plt.plot(t, map_soln["bkg_pred"], ":k", alpha=0.5)
plt.plot(t, map_soln["rv_model_pred"], label="model")
plt.legend(fontsize=10)
plt.xlim(t.min(), t.max())
plt.xlabel("time [days]")
plt.ylabel("radial velocity [m/s]")
plt.title("MAP model");
That looks better.
Now that we have our model set up and a good estimate of the initial
parameters, let’s start sampling. There are substantial covariances
between some of the parameters so we’ll use
exoplanet.get_dense_nuts_step()
to tune the sampler (see the
PyMC3 extras tutorial for more information).
np.random.seed(42)
with model:
trace = pm.sample(
tune=4000, draws=4000, step=xo.get_dense_nuts_step(target_accept=0.95)
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [trend, logs, omega, ecc, logK, P, t0]
Sampling 4 chains: 100%|██████████| 32000/32000 [01:38<00:00, 324.88draws/s]
There was 1 divergence after tuning. Increase target_accept or reparameterize.
There were 5 divergences after tuning. Increase target_accept or reparameterize.
There were 4 divergences after tuning. Increase target_accept or reparameterize.
There were 6 divergences after tuning. Increase target_accept or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
After sampling, it’s always a good idea to do some convergence checks. First, let’s check the number of effective samples and the Gelman-Rubin statistic for our parameters of interest:
pm.summary(trace, varnames=["trend", "logs", "omega", "ecc", "t0", "logK", "P"])
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
trend__0 | 0.000959 | 0.000767 | 0.000008 | -0.000533 | 0.002476 | 8822.213670 | 0.999924 |
trend__1 | -0.038355 | 0.022526 | 0.000198 | -0.082882 | 0.005619 | 12375.503833 | 0.999947 |
trend__2 | -1.969807 | 0.798796 | 0.007218 | -3.486318 | -0.396062 | 10515.726879 | 1.000064 |
logs | 1.035119 | 0.223164 | 0.002460 | 0.595741 | 1.471820 | 7444.652535 | 1.000065 |
omega__0 | -0.295243 | 0.770839 | 0.013249 | -1.457638 | 1.357533 | 3367.327667 | 1.001333 |
omega__1 | -0.626204 | 2.109072 | 0.028147 | -3.138377 | 2.923905 | 5452.408556 | 1.000018 |
ecc__0 | 0.238510 | 0.111980 | 0.001492 | 0.010760 | 0.440057 | 5346.847668 | 1.000452 |
ecc__1 | 0.199042 | 0.152331 | 0.002676 | 0.000026 | 0.499889 | 3250.981569 | 1.000073 |
t0__0 | 2072.794800 | 0.000704 | 0.000006 | 2072.793387 | 2072.796154 | 16908.246398 | 0.999888 |
t0__1 | 2082.625099 | 0.000396 | 0.000003 | 2082.624299 | 2082.625864 | 19029.275910 | 0.999969 |
logK__0 | 1.555357 | 0.254439 | 0.004343 | 1.022535 | 1.985683 | 3303.559263 | 0.999944 |
logK__1 | 1.573699 | 0.232897 | 0.002672 | 1.110036 | 2.019354 | 6864.314828 | 1.000101 |
P__0 | 20.885096 | 0.000299 | 0.000003 | 20.884487 | 20.885659 | 11727.407960 | 1.000444 |
P__1 | 42.363304 | 0.000606 | 0.000004 | 42.362107 | 42.364480 | 17521.191304 | 0.999982 |
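As a refresher, the Gelman-Rubin statistic (the Rhat column above) compares the between-chain and within-chain variances and should be close to 1 for a converged chain. Here is a minimal sketch of the classic (non-split) version; pm.summary computes a more refined variant:
import numpy as np
def gelman_rubin(chains):
    # chains: array of shape (n_chains, n_samples) for a single parameter,
    # e.g. np.array(trace.get_values("logs", combine=False))
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()    # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)  # between-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)              # close to 1 when converged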
It looks like everything is pretty much converged here. Not bad for 14 parameters and about a minute of runtime…
Then we can make a corner plot of any combination of the parameters. For example, let’s look at period, semi-amplitude, and eccentricity:
import corner
samples = pm.trace_to_dataframe(trace, varnames=["P", "logK", "ecc", "omega"])
corner.corner(samples);
Finally, let’s plot the posterior constraints on the RV model and compare those to the data:
plt.errorbar(x, y, yerr=yerr, fmt=".k")
# Compute the posterior predictions for the RV model
pred = np.percentile(trace["rv_model_pred"], [16, 50, 84], axis=0)
plt.plot(t, pred[1], color="C0", label="model")
art = plt.fill_between(t, pred[0], pred[2], color="C0", alpha=0.3)
art.set_edgecolor("none")
plt.legend(fontsize=10)
plt.xlim(t.min(), t.max())
plt.xlabel("time [days]")
plt.ylabel("radial velocity [m/s]")
plt.title("posterior constraints");
It might also be interesting to look at the phased plots for this system. Here we’ll fold the dataset on the median of the posterior period and then overplot the posterior constraint on the folded model orbits.
for n, letter in enumerate("bc"):
plt.figure()
# Get the posterior median orbital parameters
p = np.median(trace["P"][:, n])
t0 = np.median(trace["t0"][:, n])
# Compute the posterior median estimate of the background RV
# and the contribution from the other planet. Then we can remove
# this from the data to plot just the planet we care about.
other = np.median(trace["vrad"][:, :, (n + 1) % 2], axis=0)
other += np.median(trace["bkg"], axis=0)
# Plot the folded data
x_fold = (x - t0 + 0.5 * p) % p - 0.5 * p
plt.errorbar(x_fold, y - other, yerr=yerr, fmt=".k")
# Compute the posterior prediction for the folded RV model for this
# planet
t_fold = (t - t0 + 0.5 * p) % p - 0.5 * p
inds = np.argsort(t_fold)
pred = np.percentile(trace["vrad_pred"][:, inds, n], [16, 50, 84], axis=0)
plt.plot(t_fold[inds], pred[1], color="C0", label="model")
art = plt.fill_between(t_fold[inds], pred[0], pred[2], color="C0", alpha=0.3)
art.set_edgecolor("none")
plt.legend(fontsize=10)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("phase [days]")
plt.ylabel("radial velocity [m/s]")
plt.title("K2-24{0}".format(letter));
As described in the Citing exoplanet & its dependencies tutorial, we can use
exoplanet.citations.get_citations_for_model()
to construct an
acknowledgement and BibTeX listing that includes the relevant citations
for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
exoplanet includes methods for computing the light curves of transiting planets. In its simplest form, this can be used to evaluate a light curve like you would do with batman, for example:
import numpy as np
import matplotlib.pyplot as plt
import exoplanet as xo
# The light curve calculation requires an orbit
orbit = xo.orbits.KeplerianOrbit(period=3.456)
# Compute a limb-darkened light curve using starry
t = np.linspace(-0.1, 0.1, 1000)
u = [0.3, 0.2]
light_curve = (
xo.LimbDarkLightCurve(u).get_light_curve(orbit=orbit, r=0.1, t=t, texp=0.02).eval()
)
# Note: the `eval` is needed because this is using Theano in
# the background
plt.plot(t, light_curve, color="C0", lw=2)
plt.ylabel("relative flux")
plt.xlabel("time [days]")
plt.xlim(t.min(), t.max());
But the real power comes from the fact that this is defined as a Theano operation so it can be combined with PyMC3 to do transit inference using Hamiltonian Monte Carlo.
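For example, since the light curve is built from Theano operations, Theano can compute its gradients automatically. Here is a minimal, illustrative sketch that differentiates the summed flux with respect to the planet radius:
import numpy as np
import theano
import theano.tensor as tt
import exoplanet as xo
r = tt.dscalar("r")  # the planet radius as a symbolic variable
orbit = xo.orbits.KeplerianOrbit(period=3.456)
lc = xo.LimbDarkLightCurve([0.3, 0.2]).get_light_curve(
    orbit=orbit, r=r, t=np.linspace(-0.1, 0.1, 100)
)
# This automatic differentiation is the machinery that HMC relies on
grad_func = theano.function([r], theano.grad(tt.sum(lc), r))
print(grad_func(0.1))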
In this section, we will construct a simple transit fit model using PyMC3 and then we will fit a two planet model to simulated data. To start, let’s randomly sample some periods and phases and then define the time sampling:
np.random.seed(123)
periods = np.random.uniform(5, 20, 2)
t0s = periods * np.random.rand(2)
t = np.arange(0, 80, 0.02)
yerr = 5e-4
Then, define the parameters. In this simple model, we’ll just fit for
the limb darkening parameters of the star, and the period, phase, impact
parameter, and radius ratio of the planets (note: this is already 10
parameters, and running MCMC to convergence using
emcee would probably take at least an
hour). For the limb darkening, we’ll use a quadratic law as
parameterized by Kipping (2013).
This reparameterization is implemented in exoplanet as the custom PyMC3
distribution exoplanet.distributions.QuadLimbDark.
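In the Kipping (2013) parameterization, two variables \(q_1\) and \(q_2\) are sampled uniformly on the unit square and mapped to the quadratic coefficients via \(u_1 = 2\sqrt{q_1}\,q_2\) and \(u_2 = \sqrt{q_1}\,(1 - 2 q_2)\), which uniformly covers the physically allowed region. A quick numerical check of this mapping:
import numpy as np
def kipping_to_quad(q1, q2):
    # Map the Kipping (2013) variables (q1, q2) on the unit square
    # to quadratic limb darkening coefficients (u1, u2)
    sqrt_q1 = np.sqrt(q1)
    return 2 * sqrt_q1 * q2, sqrt_q1 * (1 - 2 * q2)
print(kipping_to_quad(0.25, 0.3))  # -> (0.3, 0.2), the testval used below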
import pymc3 as pm
with pm.Model() as model:
# The baseline flux
mean = pm.Normal("mean", mu=0.0, sd=1.0)
# The time of a reference transit for each planet
t0 = pm.Normal("t0", mu=t0s, sd=1.0, shape=2)
# The log period; also tracking the period itself
logP = pm.Normal("logP", mu=np.log(periods), sd=0.1, shape=2)
period = pm.Deterministic("period", pm.math.exp(logP))
# The Kipping (2013) parameterization for quadratic limb darkening parameters
u = xo.distributions.QuadLimbDark("u", testval=np.array([0.3, 0.2]))
r = pm.Uniform("r", lower=0.01, upper=0.1, shape=2, testval=np.array([0.04, 0.06]))
b = xo.distributions.ImpactParameter("b", ror=r, shape=2, testval=np.random.rand(2))
# Set up a Keplerian orbit for the planets
orbit = xo.orbits.KeplerianOrbit(period=period, t0=t0, b=b)
# Compute the model light curve using starry
light_curves = xo.LimbDarkLightCurve(u).get_light_curve(orbit=orbit, r=r, t=t)
light_curve = pm.math.sum(light_curves, axis=-1) + mean
# Here we track the value of the model light curve for plotting
# purposes
pm.Deterministic("light_curves", light_curves)
# In this line, we simulate the dataset that we will fit
y = xo.eval_in_model(light_curve)
y += yerr * np.random.randn(len(y))
# The likelihood function assuming known Gaussian uncertainty
pm.Normal("obs", mu=light_curve, sd=yerr, observed=y)
# Fit for the maximum a posteriori parameters given the simulated
# dataset
map_soln = xo.optimize(start=model.test_point)
optimizing logp for variables: [b, r, u, logP, t0, mean]
99it [00:05, 18.44it/s, logp=2.479354e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 24787.977771807487 -> 24793.5394256112
Now we can plot the simulated data and the maximum a posteriori model to make sure that our initialization looks ok.
plt.plot(t, y, ".k", ms=4, label="data")
for i, l in enumerate("bc"):
plt.plot(t, map_soln["light_curves"][:, i], lw=1, label="planet {0}".format(l))
plt.xlim(t.min(), t.max())
plt.ylabel("relative flux")
plt.xlabel("time [days]")
plt.legend(fontsize=10)
plt.title("map model");
Now, let’s sample from the posterior defined by this model. As usual,
there are strong covariances between some of the parameters so we’ll use
exoplanet.get_dense_nuts_step()
.
np.random.seed(42)
with model:
trace = pm.sample(
tune=3000,
draws=3000,
start=map_soln,
chains=4,
step=xo.get_dense_nuts_step(target_accept=0.9),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [b, r, u, logP, t0, mean]
Sampling 4 chains: 100%|██████████| 24000/24000 [01:11<00:00, 336.21draws/s]
After sampling, it’s important that we assess convergence. We can do
that using the pymc3.summary
function:
pm.summary(trace, varnames=["period", "t0", "r", "b", "u", "mean"])
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
period__0 | 15.447902 | 0.002320 | 3.405381e-05 | 15.443095 | 15.452003 | 4118.731421 | 1.000025 |
period__1 | 9.292482 | 0.000300 | 2.758733e-06 | 9.291906 | 9.293056 | 10127.995186 | 0.999989 |
t0__0 | 3.503134 | 0.005865 | 8.604283e-05 | 3.492631 | 3.515298 | 4265.262327 | 1.000003 |
t0__1 | 5.121263 | 0.001362 | 1.175321e-05 | 5.118684 | 5.123971 | 12147.776077 | 0.999932 |
r__0 | 0.039576 | 0.001603 | 1.822197e-05 | 0.036484 | 0.042709 | 5682.073270 | 1.000130 |
r__1 | 0.058479 | 0.001045 | 1.356482e-05 | 0.056528 | 0.060473 | 5539.977100 | 1.000041 |
b__0 | 0.668903 | 0.048025 | 7.212047e-04 | 0.568592 | 0.741325 | 3319.747824 | 1.000335 |
b__1 | 0.402737 | 0.038150 | 4.939333e-04 | 0.329724 | 0.472222 | 5365.256844 | 0.999898 |
u__0 | 0.374367 | 0.208776 | 2.317348e-03 | 0.002057 | 0.728739 | 8783.841436 | 1.000424 |
u__1 | 0.273628 | 0.341880 | 4.379389e-03 | -0.332653 | 0.861825 | 5850.257955 | 1.000671 |
mean | 0.000005 | 0.000008 | 6.585676e-08 | -0.000011 | 0.000020 | 14670.582674 | 0.999842 |
That looks pretty good! Fitting this without exoplanet would have taken a lot more patience.
Now we can also look at the corner plot of some of the parameters of interest:
import corner
samples = pm.trace_to_dataframe(trace, varnames=["period", "r"])
truth = np.concatenate(xo.eval_in_model([period, r], model.test_point, model=model))
corner.corner(
samples, truths=truth, labels=["period 1", "period 2", "radius 1", "radius 2"]
);
Like in the radial velocity tutorial (Radial velocity fitting), we can make plots of the model predictions for each planet.
for n, letter in enumerate("bc"):
plt.figure()
# Get the posterior median orbital parameters
p = np.median(trace["period"][:, n])
t0 = np.median(trace["t0"][:, n])
# Compute the posterior median estimate of the contribution from
# the other planet. Then we can remove this from the data to plot
# just the planet we care about.
other = np.median(trace["light_curves"][:, :, (n + 1) % 2], axis=0)
# Plot the folded data
x_fold = (t - t0 + 0.5 * p) % p - 0.5 * p
plt.errorbar(x_fold, y - other, yerr=yerr, fmt=".k", label="data", zorder=-1000)
# Plot the folded model
inds = np.argsort(x_fold)
inds = inds[np.abs(x_fold)[inds] < 0.3]
pred = trace["light_curves"][:, inds, n] + trace["mean"][:, None]
pred = np.median(pred, axis=0)
plt.plot(x_fold[inds], pred, color="C1", label="model")
# Annotate the plot with the planet's period
txt = "period = {0:.4f} +/- {1:.4f} d".format(
np.mean(trace["period"][:, n]), np.std(trace["period"][:, n])
)
plt.annotate(
txt,
(0, 0),
xycoords="axes fraction",
xytext=(5, 5),
textcoords="offset points",
ha="left",
va="bottom",
fontsize=12,
)
plt.legend(fontsize=10, loc=4)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("time since transit [days]")
plt.ylabel("relative flux")
plt.title("planet {0}".format(letter))
plt.xlim(-0.3, 0.3)
As described in the Citing exoplanet & its dependencies tutorial, we can use
exoplanet.citations.get_citations_for_model()
to construct an
acknowledgement and BibTeX listing that includes the relevant citations
for this model. This is especially important here because we have used
quite a few model components that should be cited.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:agol19, exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:kipping13, exoplanet:luger18, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
In this tutorial we’ll walk through the simplest astrometric example
with exoplanet
and then explain how to build up a more complicated
example with parallax measurements. For our dataset, we’ll use
astrometric and radial velocity observations of a binary star system.
Astrometric observations usually consist of measurements of the
separation and position angle of the secondary star (or directly imaged
exoplanet), relative to the primary star as a function of time. The
simplest astrometric orbit (in terms of the number of parameters) describes
the orbit using a semi-major axis a_ang
measured in arcseconds,
since the distance to the system is assumed to be unknown. We’ll work
through this example first, then introduce the extra constraints
provided by parallax information.
First, let’s load and examine the data. We’ll use the astrometric measurements of HR 466 (HD 10009) as compiled by Pourbaix 1998. The speckle observations are originally from Hartkopf et al. 1996.
from astropy.io import ascii
from astropy.time import Time
# grab the formatted data and do some munging
dirname = "https://gist.github.com/iancze/262aba2429cb9aee3fd5b5e1a4582d4d/raw/c5fa5bc39fec90d2cc2e736eed479099e3e598e3/"
astro_data_full = ascii.read(
dirname + "astro.txt", format="csv", fill_values=[(".", "0")]
)
# convert UT date to JD
astro_dates = Time(astro_data_full["date"].data, format="decimalyear")
# Following the Pourbaix et al. 1998 analysis, we'll limit ourselves to the highest quality data
# since the raw collection of data outside of these ranges has some ambiguities in swapping
# the primary and secondary star
ind = (
(astro_dates.value > 1975.0)
& (astro_dates.value < 1999.73)
& (~astro_data_full["rho"].mask)
& (~astro_data_full["PA"].mask)
)
astro_data = astro_data_full[ind]
astro_yrs = astro_data["date"]
astro_dates.format = "jd"
astro_jds = astro_dates[ind].value
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 6 of "dubious year (Note 6)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 5 of "dubious year (Note 6)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "utctai" yielded 5 of "dubious year (Note 3)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "utctai" yielded 6 of "dubious year (Note 3)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "taiutc" yielded 6 of "dubious year (Note 4)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "d2dtf" yielded 6 of "dubious year (Note 5)" [astropy._erfa.core]
Many of the measurements in this heterogeneous dataset do not have reported uncertainties. For these, we assume a modest uncertainty of \(1^\circ\) in position angle and \(0.01^{\prime\prime}\) in separation for the sake of specifying something, but we’ll include a jitter term for both of these measurements as well. The scatter of the points around the final solution will be a decent guide to what the measurement uncertainties actually were.
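Concretely, a jitter term inflates each reported uncertainty in quadrature, \(\sigma_\mathrm{tot}^2 = \sigma^2 + e^{2 \log s}\). A small illustration with hypothetical numbers:
import numpy as np
sigma_reported = 0.01  # assumed separation uncertainty [arcsec]
log_s = np.log(0.005)  # hypothetical jitter parameter
sigma_tot = np.sqrt(sigma_reported ** 2 + np.exp(2 * log_s))
print(sigma_tot)  # about 0.0112 arcsec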
import numpy as np
astro_data["rho_err"][astro_data["rho_err"].mask == True] = 0.01
astro_data["PA_err"][astro_data["PA_err"].mask == True] = 1.0
# Convert all masked frames to be raw np arrays, since theano has issues with astropy masked columns
rho_data = np.ascontiguousarray(astro_data["rho"], dtype=float) # arcsec
rho_err = np.ascontiguousarray(astro_data["rho_err"], dtype=float)
# The position angle measurements come in degrees in the range [0, 360].
# We'll convert this to radians in the range [-pi, pi]
deg = np.pi / 180.0
theta_data = np.ascontiguousarray(astro_data["PA"] * deg, dtype=float)
theta_data[theta_data > np.pi] -= 2 * np.pi
theta_err = np.ascontiguousarray(astro_data["PA_err"] * deg) # radians
The conventions describing the orientation of the orbits are described in detail in the exoplanet paper; we summarize them briefly here. Generally, we follow the conventions from Pourbaix et al. 1998, which are a consistent set conforming to the right-hand rule and to the conventions of the visual binary field, where the ascending node is the one at which the secondary is receding from the observer (without radial velocity information, there is a \(\pi\) degeneracy in which node is ascending, so common practice in the literature is to report a value in the range \([0, \pi]\)). The orbital inclination ranges over \([0, \pi]\): \(i = 0\) describes a face-on orbit rotating counter-clockwise on the sky plane, \(i = \pi\) describes a face-on orbit rotating clockwise on the sky, and \(i = \pi/2\) is an edge-on orbit.
The observer frame \(X\), \(Y\), \(Z\) is oriented on the sky such that \(+Z\) points towards the observer, \(X\) is the north axis, and \(Y\) is the east axis. All angles are measured in radians, and the position angle is returned in the range \([-\pi, \pi]\), measured east of north (be sure to check that your data are in this format too!). The radial velocity is still defined such that a positive radial velocity corresponds to motion away from the observer.
In an astrometric-only orbit, it is common practice in the field to report \(\omega = \omega_\mathrm{secondary}\), whereas with an RV orbit it is generally common practice to report \(\omega = \omega_\mathrm{primary}\). The result is that unless the authors specify what they’re using, in a joint astrometric-RV orbit there is an ambiguity to which \(\omega\) the authors mean, since \(\omega_\mathrm{primary} = \omega_\mathrm{secondary} + \pi\). To standardize this across the exoplanet package, in all orbits (including astrometric-only) \(\omega = \omega_\mathrm{primary}\).
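In practice this just means adding or subtracting \(\pi\) when ingesting literature values. For example, anticipating the Pourbaix et al. 1998 value that we use below:
import numpy as np
deg = np.pi / 180.0
omega_secondary = 251.6 * deg  # omega_2 as reported in the literature
omega_primary = omega_secondary - np.pi  # the convention that exoplanet expects
print(omega_primary / deg)  # 71.6 degrees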
import matplotlib.pyplot as plt
# Make a plot of the astrometric data on the sky
# The convention is that North is up and East is left
fig, ax = plt.subplots(nrows=1, figsize=(4, 4))
xs = rho_data * np.cos(theta_data) # X is north
ys = rho_data * np.sin(theta_data) # Y is east
ax.plot(ys, xs, ".k")
ax.set_ylabel(r"$\Delta \delta$ ['']")
ax.set_xlabel(r"$\Delta \alpha \cos \delta$ ['']")
ax.invert_xaxis()
ax.plot(0, 0, "k*")
ax.set_aspect("equal", "datalim")
The plot on the sky is helpful to look at, but the “raw” measurements are the values of \(\rho\) (separation) and \(\theta\) (position angle, also called P.A.) listed in our data table, and the measurement uncertainties are assumed to be nice Gaussians on these values. So, to visualize this space more clearly, we can plot \(\rho\) vs. time and P.A. vs. time.
fig, ax = plt.subplots(nrows=2, sharex=True)
ax[0].errorbar(astro_yrs, rho_data, yerr=rho_err, fmt=".k", lw=1, ms=5)
ax[0].set_ylabel(r'$\rho\,$ ["]')
ax[1].errorbar(astro_yrs, theta_data, yerr=theta_err, fmt=".k", lw=1, ms=5)
ax[1].set_ylabel(r"P.A. [radians]")
ax[1].set_xlabel("time [years]");
To get started, let’s import the relevant packages from exoplanet, plot up a preliminary orbit from the literature, and then sample to find the best parameters.
Note
Orbits in exoplanet generally specify the semi-major axis in units of solar radii R_sun. For transits and RV orbits, you usually have enough external information (e.g., estimate of stellar mass from spectral type) to put a physical scale onto the orbit. For the most basic of astrometric orbits without parallax information, however, this information can be lacking and thus it makes sense to fit for the semi-major axis in units of arcseconds. But, exoplanet is modeling a real orbit (where semi-major axis is in units of R_sun), so we do need to at least provide a fake parallax to convert from arcseconds to R_sun.
import pymc3 as pm
import theano
import theano.tensor as tt
import exoplanet as xo
from exoplanet.distributions import Angle
from astropy import constants
# conversion constant from au to R_sun
au_to_R_sun = (constants.au / constants.R_sun).value
# Just to get started, let's take a look at the orbit using the best-fit parameters from Pourbaix et al. 1998
# Orbital elements from Pourbaix et al. 1998
# For the relative astrometric fit, we only need the following parameters
a_ang = 0.324 # arcsec
parallax = 1 # arcsec (meaningless choice for now)
a = a_ang * au_to_R_sun / parallax
e = 0.798
i = 96.0 * deg # [rad]
omega = 251.6 * deg - np.pi # Pourbaix reports omega_2, but we want omega_1
Omega = 159.6 * deg
P = 28.8 * 365.25 # days
T0 = Time(1989.92, format="decimalyear")
T0.format = "jd"
T0 = T0.value # [Julian Date]
# instantiate the orbit
orbit = xo.orbits.KeplerianOrbit(
a=a, t_periastron=T0, period=P, incl=i, ecc=e, omega=omega, Omega=Omega
)
# The position functions take an optional argument parallax to convert from
# physical units back to arcseconds
t = np.linspace(T0 - P, T0 + P, num=200) # days
rho, theta = theano.function([], orbit.get_relative_angles(t, parallax))()
# Plot the orbit
fig, ax = plt.subplots(nrows=1, figsize=(4, 4))
xs = rho * np.cos(theta) # X is north
ys = rho * np.sin(theta) # Y is east
ax.plot(ys, xs, color="C0", lw=1)
# plot the data
xs = rho_data * np.cos(theta_data) # X is north
ys = rho_data * np.sin(theta_data) # Y is east
ax.plot(ys, xs, ".k")
ax.set_ylabel(r"$\Delta \delta$ ['']")
ax.set_xlabel(r"$\Delta \alpha \cos \delta$ ['']")
ax.invert_xaxis()
ax.plot(0, 0, "k*")
ax.set_aspect("equal", "datalim")
ax.set_title("initial orbit")
fig, ax = plt.subplots(nrows=2, sharex=True, figsize=(6, 6))
ax[0].errorbar(astro_jds, rho_data, yerr=rho_err, fmt=".k", lw=1, ms=5)
ax[0].plot(t, rho, color="C0", lw=1)
ax[0].set_ylabel(r'$\rho\,$ ["]')
ax[0].set_title("initial orbit")
ax[1].errorbar(astro_jds, theta_data, yerr=theta_err, fmt=".k", lw=1, ms=5)
ax[1].plot(t, theta, color="C0", lw=1)
ax[1].set_ylabel(r"P.A. [radians]")
ax[1].set_xlabel("time [JD]");
Now that we have an initial orbit, we can set the model up using PyMC3 to do inference.
yr = 365.25
# for predicted orbits
t_fine = np.linspace(astro_jds.min() - 500, astro_jds.max() + 500, num=1000)
def get_model(parallax=None):
with pm.Model() as model:
if parallax is None:
# Without an actual parallax measurement, we can model the orbit in units of arcseconds
# by providing a fake parallax and conversion constant
plx = 1 # arcsec
else:
# Below we will run a version of this model where a measurement of parallax is provided
# The measurement is in milliarcsec
m_plx = pm.Bound(pm.Normal, lower=0, upper=100)(
"m_plx", mu=parallax[0], sd=parallax[1], testval=parallax[0]
)
plx = pm.Deterministic("plx", 1e-3 * m_plx)
a_ang = pm.Uniform("a_ang", 0.1, 1.0, testval=0.324)
a = pm.Deterministic("a", a_ang / plx)
# We expect the period to be somewhere in the range of 25 years,
# so we'll set a broad prior on logP
logP = pm.Normal("logP", mu=np.log(25 * yr), sd=10.0, testval=np.log(28.8 * yr))
P = pm.Deterministic("P", tt.exp(logP))
# For astrometric-only fits, it's generally better to fit in
# p = (Omega + omega)/2 and m = (Omega - omega)/2 instead of omega and Omega
# directly
omega0 = 251.6 * deg - np.pi
Omega0 = 159.6 * deg
p = Angle("p", testval=0.5 * (Omega0 + omega0))
m = Angle("m", testval=0.5 * (Omega0 - omega0))
omega = pm.Deterministic("omega", p - m)
Omega = pm.Deterministic("Omega", p + m)
# For these orbits, it can also be better to fit for a phase angle
# (relative to a reference time) instead of the time of periastron
# passage directly
phase = Angle("phase", testval=0.0)
tperi = pm.Deterministic("tperi", T0 + P * phase / (2 * np.pi))
# Geometric uniform prior on cos(incl)
cos_incl = pm.Uniform("cos_incl", lower=-1, upper=1, testval=np.cos(96.0 * deg))
incl = pm.Deterministic("incl", tt.arccos(cos_incl))
ecc = pm.Uniform("ecc", lower=0.0, upper=1.0, testval=0.798)
# Set up the orbit
orbit = xo.orbits.KeplerianOrbit(
a=a * au_to_R_sun,
t_periastron=tperi,
period=P,
incl=incl,
ecc=ecc,
omega=omega,
Omega=Omega,
)
if parallax is not None:
pm.Deterministic("M_tot", orbit.m_total)
# Compute the model in rho and theta
rho_model, theta_model = orbit.get_relative_angles(astro_jds, plx)
pm.Deterministic("rho_model", rho_model)
pm.Deterministic("theta_model", theta_model)
# Add jitter terms to both separation and position angle
log_rho_s = pm.Normal("log_rho_s", mu=np.log(np.median(rho_err)), sd=5.0)
log_theta_s = pm.Normal("log_theta_s", mu=np.log(np.median(theta_err)), sd=5.0)
rho_tot_err = tt.sqrt(rho_err ** 2 + tt.exp(2 * log_rho_s))
theta_tot_err = tt.sqrt(theta_err ** 2 + tt.exp(2 * log_theta_s))
# define the likelihood function, e.g., a Gaussian on both rho and theta
pm.Normal("rho_obs", mu=rho_model, sd=rho_tot_err, observed=rho_data)
# We want to be cognizant of the fact that theta wraps so the following is equivalent to
# pm.Normal("obs_theta", mu=theta_model, observed=theta_data, sd=theta_tot_err)
# but takes into account the wrapping. Thanks to Rob de Rosa for the tip.
theta_diff = tt.arctan2(
tt.sin(theta_model - theta_data), tt.cos(theta_model - theta_data)
)
pm.Normal("theta_obs", mu=theta_diff, sd=theta_tot_err, observed=0.0)
# Set up predicted orbits for later plotting
rho_dense, theta_dense = orbit.get_relative_angles(t_fine, plx)
rho_save = pm.Deterministic("rho_save", rho_dense)
theta_save = pm.Deterministic("theta_save", theta_dense)
# Optimize to find the initial parameters
map_soln = model.test_point
map_soln = xo.optimize(map_soln, vars=[log_rho_s, log_theta_s])
map_soln = xo.optimize(map_soln, vars=[phase])
map_soln = xo.optimize(map_soln, vars=[p, m, ecc])
map_soln = xo.optimize(map_soln, vars=[logP, a_ang, phase])
map_soln = xo.optimize(map_soln)
return model, map_soln
model, map_soln = get_model()
optimizing logp for variables: [log_theta_s, log_rho_s]
11it [00:04, 2.71it/s, logp=1.471440e+02]
message: Optimization terminated successfully.
logp: 104.85554109304441 -> 147.14399186005338
optimizing logp for variables: [phase]
23it [00:00, 29.30it/s, logp=1.676422e+02]
message: Optimization terminated successfully.
logp: 147.14399186005338 -> 167.64220598197195
optimizing logp for variables: [ecc, m, p]
35it [00:00, 48.51it/s, logp=2.100634e+02]
message: Optimization terminated successfully.
logp: 167.64220598197198 -> 210.06340668297906
optimizing logp for variables: [phase, a_ang, logP]
12it [00:01, 11.83it/s, logp=2.105014e+02]
message: Optimization terminated successfully.
logp: 210.0634066829791 -> 210.5013698914921
optimizing logp for variables: [log_theta_s, log_rho_s, ecc, cos_incl, phase, m, p, logP, a_ang]
103it [00:01, 100.88it/s, logp=2.150212e+02]
message: Optimization terminated successfully.
logp: 210.5013698914921 -> 215.02117742211766
Now that we have a maximum a posteriori estimate of the parameters, let’s take a look at the results to make sure that they seem reasonable.
ekw = dict(fmt=".k", lw=0.5)
fig, ax = plt.subplots(nrows=4, sharex=True, figsize=(6, 8))
ax[0].set_ylabel(r'$\rho\,$ ["]')
ax[1].set_ylabel(r"$\rho$ residuals")
ax[2].set_ylabel(r"P.A. [radians]")
ax[3].set_ylabel(r"P.A. residuals")
tot_rho_err = np.sqrt(rho_err ** 2 + np.exp(2 * map_soln["log_rho_s"]))
tot_theta_err = np.sqrt(theta_err ** 2 + np.exp(2 * map_soln["log_theta_s"]))
ax[0].errorbar(astro_jds, rho_data, yerr=tot_rho_err, **ekw)
ax[0].plot(t_fine, map_soln["rho_save"], "C1")
ax[1].axhline(0.0, color="0.5")
ax[1].errorbar(astro_jds, rho_data - map_soln["rho_model"], yerr=tot_rho_err, **ekw)
ax[2].plot(t_fine, map_soln["theta_save"], "C1")
ax[2].errorbar(astro_jds, theta_data, yerr=tot_theta_err, **ekw)
ax[3].axhline(0.0, color="0.5")
ax[3].errorbar(
astro_jds, theta_data - map_soln["theta_model"], yerr=tot_theta_err, **ekw
)
ax[3].set_xlim(t_fine[0], t_fine[-1])
ax[0].set_title("map orbit");
Now let’s sample the posterior.
np.random.seed(1234)
with model:
trace = pm.sample(
tune=5000,
draws=4000,
start=map_soln,
step=xo.get_dense_nuts_step(target_accept=0.9, adaptation_window=201),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [log_theta_s, log_rho_s, ecc, cos_incl, phase, m, p, logP, a_ang]
Sampling 4 chains: 100%|██████████| 36000/36000 [01:54<00:00, 315.59draws/s]
There were 2 divergences after tuning. Increase target_accept or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
First we can check the convergence for some of the key parameters.
pm.summary(trace, varnames=["P", "tperi", "a_ang", "omega", "Omega", "incl", "ecc"])
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
P | 1.038560e+04 | 130.596061 | 2.568042 | 1.017395e+04 | 1.066222e+04 | 2240.618353 | 1.000838 |
tperi | 2.447861e+06 | 19.851352 | 0.214848 | 2.447824e+06 | 2.447902e+06 | 9555.402376 | 1.000071 |
a_ang | 3.176604e-01 | 0.007585 | 0.000086 | 3.032541e-01 | 3.325282e-01 | 7520.206198 | 0.999966 |
omega | 1.234878e+00 | 0.013461 | 0.000147 | 1.209089e+00 | 1.260878e+00 | 8752.570217 | 0.999960 |
Omega | 2.786880e+00 | 0.011519 | 0.000106 | 2.764110e+00 | 2.809586e+00 | 10264.979227 | 1.000034 |
incl | 1.691413e+00 | 0.006185 | 0.000080 | 1.678347e+00 | 1.703006e+00 | 6343.849650 | 1.000109 |
ecc | 7.757083e-01 | 0.011919 | 0.000142 | 7.519233e-01 | 7.988892e-01 | 5257.964087 | 1.000226 |
That looks pretty good. Now here’s a corner plot showing the covariances between parameters.
import corner
samples = pm.trace_to_dataframe(trace, varnames=["ecc"])
samples["$P$ [yr]"] = trace["P"] / yr
samples["$T_\mathrm{peri} - T_0$ [day]"] = trace["tperi"] - T0
samples["$a$ [arcsec]"] = trace["a_ang"]
samples["$\omega$ [deg]"] = (trace["omega"] / deg) % 360
samples["$\Omega$ [deg]"] = (trace["Omega"] / deg) % 360
samples["$i$ [deg]"] = (trace["incl"] / deg) % 360
samples["$e$"] = samples["ecc"]
del samples["ecc"]
corner.corner(samples);
Finally, we can plot the posterior constraints on \(\rho\) and \(\theta\). This figure is much like the one for the MAP solution above, but this time the orange is a contour (not a line) showing the 68% credible region for the model.
ekw = dict(fmt=".k", lw=0.5)
fig, ax = plt.subplots(nrows=2, sharex=True, figsize=(6, 6))
ax[0].set_ylabel(r'$\rho\,$ ["]')
ax[1].set_ylabel(r"P.A. [radians]")
tot_rho_err = np.sqrt(rho_err ** 2 + np.exp(2 * np.median(trace["log_rho_s"], axis=0)))
tot_theta_err = np.sqrt(
theta_err ** 2 + np.exp(2 * np.median(trace["log_theta_s"], axis=0))
)
ax[0].errorbar(astro_jds, rho_data, yerr=tot_rho_err, **ekw)
q = np.percentile(trace["rho_save"], [16, 84], axis=0)
ax[0].fill_between(t_fine, q[0], q[1], color="C1", alpha=0.8, lw=0)
ax[1].errorbar(astro_jds, theta_data, yerr=tot_theta_err, **ekw)
q = np.percentile(trace["theta_save"], [16, 84], axis=0)
ax[1].fill_between(t_fine, q[0], q[1], color="C1", alpha=0.8, lw=0)
ax[-1].set_xlim(t_fine[0], t_fine[-1])
ax[0].set_title("posterior inferences");
As we can see from the narrow range of orbits (the orange swath appears like a thin line), the orbit is actually highly constrained by the astrometry. We also see two outlier epochs in the vicinity of JD 2445000 - 2447000, since adjacent epochs seem to be right on the orbit. It’s likely that the uncertainties were not estimated correctly for these points, and the simplistic jitter model we implemented isn’t sophisticated enough to assign extra variance to only these discrepant epochs.
While it is encouraging that we fit an astrometric orbit, a simple astrometric fit to just \(\rho\) and \(\theta\) isn’t actually that physically satisfying, since many of the orbital parameters simply have to do with the orientation of the orbit relative to us (\(i\), \(\omega\), and \(\Omega\)). The only truly intrinsic parameters are \(P\) and \(e\). To learn more about some of the physical parameters, such as the total mass of the system, we’d like to incorporate distance information to put a physical scale on the problem.
The Gaia DR2 parallax is \(\varpi = 24.05 \pm 0.45\) mas.
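To see why the parallax unlocks the total mass, recall Kepler’s third law in convenient units: \(M_\mathrm{tot}\,[M_\odot] = a^3\,[\mathrm{au}] / P^2\,[\mathrm{yr}]\). A rough back-of-the-envelope check using approximate values from the fit above:
a_ang = 0.318  # angular semi-major axis [arcsec]
parallax = 24.05e-3  # Gaia DR2 parallax [arcsec]
a_au = a_ang / parallax  # physical semi-major axis [au]
P_yr = 28.4  # orbital period [yr]
print(a_au ** 3 / P_yr ** 2)  # about 2.9 M_sun, consistent with M_tot below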
We can use exactly the same model as above with only an added parallax constraint:
plx_model, plx_map_soln = get_model(parallax=[24.05, 0.45])
optimizing logp for variables: [log_theta_s, log_rho_s]
11it [00:01, 6.20it/s, logp=1.499286e+02]
message: Optimization terminated successfully.
logp: 107.64015029566296 -> 149.92860106267193
optimizing logp for variables: [phase]
23it [00:01, 16.15it/s, logp=1.704268e+02]
message: Optimization terminated successfully.
logp: 149.92860106267193 -> 170.4268151845905
optimizing logp for variables: [ecc, m, p]
35it [00:00, 44.74it/s, logp=2.128480e+02]
message: Optimization terminated successfully.
logp: 170.42681518459054 -> 212.8480158855976
optimizing logp for variables: [phase, a_ang, logP]
12it [00:00, 16.25it/s, logp=2.132860e+02]
message: Optimization terminated successfully.
logp: 212.8480158855976 -> 213.28597909410882
optimizing logp for variables: [log_theta_s, log_rho_s, ecc, cos_incl, phase, m, p, logP, a_ang, m_plx]
115it [00:01, 113.38it/s, logp=2.178059e+02]
message: Desired error not necessarily achieved due to precision loss.
logp: 213.28597909410882 -> 217.8058683350306
np.random.seed(5432)
with plx_model:
plx_trace = pm.sample(
tune=5000,
draws=4000,
start=plx_map_soln,
step=xo.get_dense_nuts_step(target_accept=0.9, start=plx_map_soln),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [log_theta_s, log_rho_s, ecc, cos_incl, phase, m, p, logP, a_ang, m_plx]
Sampling 4 chains: 100%|██████████| 36000/36000 [01:53<00:00, 317.05draws/s]
There was 1 divergence after tuning. Increase target_accept or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
Check the convergence diagnostics.
pm.summary(
plx_trace,
varnames=["P", "tperi", "a_ang", "omega", "Omega", "incl", "ecc", "M_tot", "plx"],
)
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
P | 1.038256e+04 | 124.916056 | 2.067192 | 1.016925e+04 | 1.064818e+04 | 3826.618493 | 1.000518 |
tperi | 2.447861e+06 | 19.848048 | 0.219461 | 2.447822e+06 | 2.447900e+06 | 9609.039231 | 1.000113 |
a_ang | 3.176802e-01 | 0.007623 | 0.000085 | 3.027949e-01 | 3.326448e-01 | 8640.489878 | 1.000150 |
omega | 1.234978e+00 | 0.013582 | 0.000155 | 1.208285e+00 | 1.261611e+00 | 8058.727070 | 1.000193 |
Omega | 2.786885e+00 | 0.011637 | 0.000126 | 2.763404e+00 | 2.809243e+00 | 10790.652837 | 1.000192 |
incl | 1.691449e+00 | 0.006090 | 0.000063 | 1.679782e+00 | 1.703857e+00 | 9221.379543 | 0.999979 |
ecc | 7.756602e-01 | 0.012000 | 0.000131 | 7.527246e-01 | 8.002851e-01 | 7944.595520 | 1.000063 |
M_tot | 2.863766e+00 | 0.269617 | 0.003001 | 2.362946e+00 | 3.413032e+00 | 9369.328149 | 1.000088 |
plx | 2.405191e-02 | 0.000454 | 0.000004 | 2.316966e-02 | 2.493864e-02 | 13465.793319 | 0.999971 |
And make the corner plot for the physical parameters.
samples = pm.trace_to_dataframe(plx_trace, varnames=["ecc"])
samples["$P$ [yr]"] = plx_trace["P"] / yr
samples["$T_\mathrm{peri} - T_0$ [day]"] = plx_trace["tperi"] - T0
samples["$a$ [au]"] = plx_trace["a"]
samples["$M_\mathrm{tot}$ [$M_\odot$]"] = plx_trace["M_tot"]
samples["$e$"] = plx_trace["ecc"]
del samples["ecc"]
corner.corner(samples);
As described in the Citing exoplanet & its dependencies tutorial, we can use
exoplanet.citations.get_citations_for_model()
to construct an
acknowledgement and BibTeX listing that includes the relevant citations
for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
PyMC3 has support for Gaussian Processes (GPs), but this implementation is too slow for many applications in time series astrophysics. So exoplanet comes with an implementation of scalable GPs powered by celerite. More information about the algorithm can be found in the celerite docs and in the papers (Paper 1 and Paper 2), but this tutorial will give a hands on demo of how to use celerite in PyMC3.
Note
For the best results, we generally recommend the use of the exoplanet.gp.terms.SHOTerm, exoplanet.gp.terms.Matern32Term, and exoplanet.gp.terms.RotationTerm “terms”, because the other terms tend to have unphysical behavior at high frequency.
Let’s start with the quickstart demo from the celerite
docs.
We’ll fit the following simulated dataset using the sum of two
exoplanet.gp.terms.SHOTerm
objects.
First, generate the simulated data:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
t = np.sort(
np.append(np.random.uniform(0, 3.8, 57), np.random.uniform(5.5, 10, 68))
) # The input coordinates must be sorted
yerr = np.random.uniform(0.08, 0.22, len(t))
y = 0.2 * (t - 5) + np.sin(3 * t + 0.1 * (t - 5) ** 2) + yerr * np.random.randn(len(t))
true_t = np.linspace(0, 10, 5000)
true_y = 0.2 * (true_t - 5) + np.sin(3 * true_t + 0.1 * (true_t - 5) ** 2)
plt.errorbar(t, y, yerr=yerr, fmt=".k", capsize=0, label="data")
plt.plot(true_t, true_y, "k", lw=1.5, alpha=0.3, label="truth")
plt.legend(fontsize=12)
plt.xlabel("t")
plt.ylabel("y")
plt.xlim(0, 10)
plt.ylim(-2.5, 2.5);
This plot shows the simulated data as black points with error bars and the true function is shown as a gray line.
Now let’s build the PyMC3 model that we’ll use to fit the data. We can see that there’s some roughly periodic signal in the data as well as a longer term trend. To capture these two features, we will model this as a mixture of two stochastically driven simple harmonic oscillators (SHO) with the power spectrum:
\[
S(\omega) = \sqrt{\frac{2}{\pi}}\,\frac{S_1\,\omega_1^4}{(\omega^2 - \omega_1^2)^2 + 2\,\omega_1^2\,\omega^2}
+ \sqrt{\frac{2}{\pi}}\,\frac{S_2\,\omega_2^4}{(\omega^2 - \omega_2^2)^2 + \omega_2^2\,\omega^2 / Q^2}
\]
The first term is an exoplanet.gp.terms.SHOTerm with \(Q = 1/\sqrt{2}\) and the second is a regular exoplanet.gp.terms.SHOTerm. This model has 5 free parameters (\(S_1\), \(\omega_1\), \(S_2\), \(\omega_2\), and \(Q\)), and they must all be positive, so we’ll fit for the log of each parameter. Using exoplanet, this is how you would build this model, choosing more or less arbitrary initial values for the parameters.
import pymc3 as pm
import theano.tensor as tt
from exoplanet.gp import terms, GP
with pm.Model() as model:
logS1 = pm.Normal("logS1", mu=0.0, sd=15.0, testval=np.log(np.var(y)))
logw1 = pm.Normal("logw1", mu=0.0, sd=15.0, testval=np.log(3.0))
logS2 = pm.Normal("logS2", mu=0.0, sd=15.0, testval=np.log(np.var(y)))
logw2 = pm.Normal("logw2", mu=0.0, sd=15.0, testval=np.log(3.0))
logQ = pm.Normal("logQ", mu=0.0, sd=15.0, testval=0)
# Set up the kernel and GP
kernel = terms.SHOTerm(log_S0=logS1, log_w0=logw1, Q=1.0 / np.sqrt(2))
kernel += terms.SHOTerm(log_S0=logS2, log_w0=logw2, log_Q=logQ)
gp = GP(kernel, t, yerr ** 2)
# Add a custom "potential" (log probability function) with the GP likelihood
pm.Potential("gp", gp.log_likelihood(y))
A few comments here:

- The term interface in exoplanet only accepts keyword arguments with names given by the parameter_names property of the term, but it will also interpret keyword arguments with the name prefaced by log_ to be the log of the parameter. For example, in this case, we used log_S0 as the parameter for each term, but S0=tt.exp(log_S0) would have been equivalent. This is useful because many of the parameters are required to be positive, so fitting the log of those parameters is often best (see the short sketch after these comments).
- The second argument to the exoplanet.gp.GP constructor should be the variance to add along the diagonal, not the standard deviation as in the original celerite implementation.
- Finally, the exoplanet.gp.GP constructor takes an optional argument J which specifies the width of the problem if it is known at compile time. Just to be confusing, this is actually two times the J from the celerite paper. There are various technical reasons why this is difficult to work out in general, and this code will always work if you don’t provide a value for J, but you can get much better performance (especially for small J) if you know what it will be for your problem. In general, most terms cost J=2, with the exception of exoplanet.gp.terms.RealTerm (which costs J=1) and exoplanet.gp.terms.RotationTerm (which costs J=4).

To start, let’s fit for the maximum a posteriori (MAP) parameters and look at the predictions that those make.
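Before doing that, here is a minimal sketch of the log_ keyword equivalence mentioned in the first comment above (with arbitrary illustrative values):
import numpy as np
from exoplanet.gp import terms
log_S0, log_w0 = 0.1, 1.2
# These two terms are equivalent: the log_ prefix tells the term to
# exponentiate the argument internally
term_a = terms.SHOTerm(log_S0=log_S0, log_w0=log_w0, Q=1.0 / np.sqrt(2))
term_b = terms.SHOTerm(S0=np.exp(log_S0), w0=np.exp(log_w0), Q=1.0 / np.sqrt(2))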
import exoplanet as xo
with model:
map_soln = xo.optimize(start=model.test_point)
optimizing logp for variables: [logQ, logw2, logS2, logw1, logS1]
52it [00:00, 67.21it/s, logp=-1.650959e+00]
message: Optimization terminated successfully.
logp: -41.068631281249985 -> -1.6509594267148024
We’ll use the exoplanet.eval_in_model()
function to evaluate the
MAP GP model.
with model:
mu, var = xo.eval_in_model(gp.predict(true_t, return_var=True), map_soln)
plt.errorbar(t, y, yerr=yerr, fmt=".k", capsize=0, label="data")
plt.plot(true_t, true_y, "k", lw=1.5, alpha=0.3, label="truth")
# Plot the prediction and the 1-sigma uncertainty
sd = np.sqrt(var)
art = plt.fill_between(true_t, mu + sd, mu - sd, color="C1", alpha=0.3)
art.set_edgecolor("none")
plt.plot(true_t, mu, color="C1", label="prediction")
plt.legend(fontsize=12)
plt.xlabel("t")
plt.ylabel("y")
plt.xlim(0, 10)
plt.ylim(-2.5, 2.5);
Now we can sample this model using PyMC3. There are strong covariances
between the parameters so we’ll use the custom
exoplanet.get_dense_nuts_step()
to fit for these covariances
during burn-in.
with model:
trace = pm.sample(
tune=2000, draws=2000, start=map_soln, chains=2, step=xo.get_dense_nuts_step()
)
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [logQ, logw2, logS2, logw1, logS1]
Sampling 2 chains: 100%|██████████| 8000/8000 [00:17<00:00, 446.66draws/s]
There were 12 divergences after tuning. Increase target_accept or reparameterize.
There were 8 divergences after tuning. Increase target_accept or reparameterize.
Now we can compute the standard PyMC3 convergence statistics (using
pymc3.summary
) and make a trace plot (using pymc3.traceplot
).
pm.traceplot(trace)
pm.summary(trace)
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
logS1 | 4.836030 | 3.612722 | 0.087622 | -0.903379 | 11.722581 | 1680.079796 | 0.999841 |
logw1 | -2.422710 | 1.298087 | 0.030888 | -5.055399 | -0.073065 | 1794.936743 | 0.999804 |
logS2 | -4.061131 | 0.463132 | 0.009344 | -4.919930 | -3.131193 | 2483.083671 | 0.999774 |
logw2 | 1.130444 | 0.049938 | 0.000905 | 1.030003 | 1.222901 | 2804.671825 | 1.000663 |
logQ | 2.837560 | 1.136674 | 0.032128 | 1.057137 | 5.263124 | 1151.637687 | 1.000556 |
That all looks pretty good, but I like to make two other results plots: (1) a corner plot and (2) a posterior predictive plot.
The corner plot is easy using pymc3.trace_to_dataframe
and I find it
useful for understanding the covariances between parameters when
debugging.
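If you prefer numbers to panels, the same covariance information can also be summarized numerically with the sample correlation matrix:
import pymc3 as pm
df = pm.trace_to_dataframe(trace)
# Pairwise posterior correlations between the sampled parameters
print(df.corr().round(2))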
import corner
samples = pm.trace_to_dataframe(trace)
corner.corner(samples);
The “posterior predictive” plot that I like to make isn’t the same as a
“posterior predictive check” (which can be a good thing to do too).
Instead, I like to look at the predictions of the model in the space of
the data. We could have saved these predictions using a
pymc3.Deterministic
distribution, but that adds some overhead to
each evaluation of the model so instead, we can use
exoplanet.utils.get_samples_from_trace()
to loop over a few
random samples from the chain and then the
exoplanet.eval_in_model()
function to evaluate the prediction
just for those samples.
# Generate 50 realizations of the prediction sampling randomly from the chain
N_pred = 50
pred_mu = np.empty((N_pred, len(true_t)))
pred_var = np.empty((N_pred, len(true_t)))
with model:
pred = gp.predict(true_t, return_var=True)
for i, sample in enumerate(xo.get_samples_from_trace(trace, size=N_pred)):
pred_mu[i], pred_var[i] = xo.eval_in_model(pred, sample)
# Plot the predictions
for i in range(len(pred_mu)):
mu = pred_mu[i]
sd = np.sqrt(pred_var[i])
label = None if i else "prediction"
art = plt.fill_between(true_t, mu + sd, mu - sd, color="C1", alpha=0.1)
art.set_edgecolor("none")
plt.plot(true_t, mu, color="C1", label=label, alpha=0.1)
plt.errorbar(t, y, yerr=yerr, fmt=".k", capsize=0, label="data")
plt.plot(true_t, true_y, "k", lw=1.5, alpha=0.3, label="truth")
plt.legend(fontsize=12)
plt.xlabel("t")
plt.ylabel("y")
plt.xlim(0, 10)
plt.ylim(-2.5, 2.5);
As described in the Citing exoplanet & its dependencies tutorial, we can use
exoplanet.citations.get_citations_for_model()
to construct an
acknowledgement and BibTeX listing that includes the relevant citations
for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:exoplanet, exoplanet:foremanmackey17, exoplanet:foremanmackey18, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
When fitting exoplanets, we also need to fit for the stellar variability and Gaussian Processes (GPs) are often a good descriptive model for this variation. PyMC3 has support for all sorts of general GP models, but exoplanet includes support for scalable 1D GPs (see Scalable Gaussian processes in PyMC3 for more info) that can work with large datasets. In this tutorial, we go through the process of modeling the light curve of a rotating star observed by Kepler using exoplanet.
First, let’s download and plot the data:
import numpy as np
import lightkurve as lk
import matplotlib.pyplot as plt
lcf = lk.search_lightcurvefile("KIC 5809890", quarter=13).download(
quality_bitmask="hardest"
)
lc = lcf.PDCSAP_FLUX.normalize().remove_nans().remove_outliers()
x = np.ascontiguousarray(lc.time, dtype=np.float64)
y = np.ascontiguousarray(lc.flux, dtype=np.float64)
yerr = np.ascontiguousarray(lc.flux_err, dtype=np.float64)
mu = np.mean(y)
y = (y / mu - 1) * 1e3
yerr = yerr * 1e3 / mu
plt.plot(x, y, "k")
plt.xlim(x.min(), x.max())
plt.xlabel("time [days]")
plt.ylabel("relative flux [ppt]")
plt.title("KIC 5809890");
This looks like the light curve of a rotating star, and it has been shown that it is possible to model this variability by using a quasiperiodic Gaussian process. To start with, let’s get an estimate of the rotation period using the Lomb-Scargle periodogram:
import exoplanet as xo
results = xo.estimators.lomb_scargle_estimator(
x, y, max_peaks=1, min_period=5.0, max_period=100.0, samples_per_peak=50
)
peak = results["peaks"][0]
freq, power = results["periodogram"]
plt.plot(1 / freq, power, "k")
plt.axvline(peak["period"], color="k", lw=4, alpha=0.3)
plt.xlim((1 / freq).min(), (1 / freq).max())
plt.yticks([])
plt.xlabel("period [days]")
plt.ylabel("power");
Now, using this initialization, we can set up the GP model in
exoplanet. We’ll use the exoplanet.gp.terms.RotationTerm
kernel that is a mixture of two simple harmonic oscillators with periods
separated by a factor of two. As you can see from the periodogram above,
this might be a good model for this light curve and I’ve found that it
works well in many cases.
import pymc3 as pm
import theano.tensor as tt
with pm.Model() as model:
# The mean flux of the time series
mean = pm.Normal("mean", mu=0.0, sd=10.0)
# A jitter term describing excess white noise
logs2 = pm.Normal("logs2", mu=2 * np.log(np.mean(yerr)), sd=2.0)
# A term to describe the non-periodic variability
logSw4 = pm.Normal("logSw4", mu=np.log(np.var(y)), sd=5.0)
logw0 = pm.Normal("logw0", mu=np.log(2 * np.pi / 10), sd=5.0)
# The parameters of the RotationTerm kernel
logamp = pm.Normal("logamp", mu=np.log(np.var(y)), sd=5.0)
BoundedNormal = pm.Bound(pm.Normal, lower=0.0, upper=np.log(50))
logperiod = BoundedNormal("logperiod", mu=np.log(peak["period"]), sd=5.0)
logQ0 = pm.Normal("logQ0", mu=1.0, sd=10.0)
logdeltaQ = pm.Normal("logdeltaQ", mu=2.0, sd=10.0)
mix = xo.distributions.UnitUniform("mix")
# Track the period as a deterministic
period = pm.Deterministic("period", tt.exp(logperiod))
# Set up the Gaussian Process model
kernel = xo.gp.terms.SHOTerm(log_Sw4=logSw4, log_w0=logw0, Q=1 / np.sqrt(2))
kernel += xo.gp.terms.RotationTerm(
log_amp=logamp, period=period, log_Q0=logQ0, log_deltaQ=logdeltaQ, mix=mix
)
gp = xo.gp.GP(kernel, x, yerr ** 2 + tt.exp(logs2))
# Compute the Gaussian Process likelihood and add it into the
# the PyMC3 model as a "potential"
pm.Potential("loglike", gp.log_likelihood(y - mean))
# Compute the mean model prediction for plotting purposes
pm.Deterministic("pred", gp.predict())
# Optimize to find the maximum a posteriori parameters
map_soln = xo.optimize(start=model.test_point)
optimizing logp for variables: [mix, logdeltaQ, logQ0, logperiod, logamp, logw0, logSw4, logs2, mean]
163it [00:11, 13.59it/s, logp=7.031724e+02]
message: Desired error not necessarily achieved due to precision loss.
logp: 267.25686032200514 -> 703.1723809089232
Now that we have the model set up, let’s plot the maximum a posteriori model prediction.
plt.plot(x, y, "k", label="data")
plt.plot(x, map_soln["pred"], color="C1", label="model")
plt.xlim(x.min(), x.max())
plt.legend(fontsize=10)
plt.xlabel("time [days]")
plt.ylabel("relative flux [ppt]")
plt.title("KIC 5809890; map model");
That looks pretty good! Now let’s sample from the posterior using
exoplanet.get_dense_nuts_step()
.
np.random.seed(42)
with model:
trace = pm.sample(
tune=2000,
draws=2000,
start=map_soln,
step=xo.get_dense_nuts_step(target_accept=0.9),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [mix, logdeltaQ, logQ0, logperiod, logamp, logw0, logSw4, logs2, mean]
Sampling 4 chains: 100%|██████████| 16000/16000 [06:24<00:00, 41.60draws/s]
There were 2 divergences after tuning. Increase target_accept or reparameterize.
There were 2 divergences after tuning. Increase target_accept or reparameterize.
There were 5 divergences after tuning. Increase target_accept or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
Now we can do the usual convergence checks:
pm.summary(
trace,
varnames=[
"mix",
"logdeltaQ",
"logQ0",
"logperiod",
"logamp",
"logSw4",
"logw0",
"logs2",
"mean",
],
)
mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat | |
---|---|---|---|---|---|---|---|
mix | 0.638907 | 0.246249 | 0.005999 | 0.178848 | 0.999876 | 1388.904252 | 1.000946 |
logdeltaQ | 0.622804 | 9.218069 | 0.181064 | -17.817309 | 19.667857 | 2368.845797 | 1.000144 |
logQ0 | 0.781421 | 0.536186 | 0.007963 | -0.247969 | 1.821605 | 3670.677635 | 0.999928 |
logperiod | 3.326002 | 0.096711 | 0.002497 | 3.145160 | 3.528834 | 1366.843586 | 1.000468 |
logamp | 0.384757 | 0.606889 | 0.013657 | -0.588614 | 1.632578 | 1373.325178 | 1.000085 |
logSw4 | 6.610959 | 0.747819 | 0.012153 | 5.128654 | 8.023461 | 3602.498378 | 1.000397 |
logw0 | 3.861721 | 0.195403 | 0.002990 | 3.478037 | 4.236481 | 4046.885861 | 1.000415 |
logs2 | -6.207327 | 0.682157 | 0.014721 | -7.640965 | -5.205967 | 2293.592072 | 1.000621 |
mean | -0.014214 | 0.202787 | 0.002961 | -0.418842 | 0.384741 | 4635.502757 | 0.999902 |
And plot the posterior distribution over rotation period:
period_samples = trace["period"]
bins = np.linspace(20, 45, 40)
plt.hist(period_samples, bins, histtype="step", color="k")
plt.yticks([])
plt.xlim(bins.min(), bins.max())
plt.xlabel("rotation period [days]")
plt.ylabel("posterior density");
As described in the Citing exoplanet & its dependencies tutorial, we can use
exoplanet.citations.get_citations_for_model()
to construct an
acknowledgement and BibTeX listing that includes the relevant citations
for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:exoplanet, exoplanet:foremanmackey17, exoplanet:foremanmackey18, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
In this tutorial, we will combine many of the previous tutorials to perform a fit of the K2-24 system using the K2 transit data and the RVs from Petigura et al. (2016). This is the same system that we fit in the Radial velocity fitting tutorial and we’ll combine that model with the transit model from the Transit fitting tutorial and the Gaussian Process noise model from the Gaussian process models for stellar variability tutorial.
To get started, let’s download the relevant datasets. First, the transit light curve from Everest:
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
from scipy.signal import savgol_filter
# Download the data
lc_url = "https://archive.stsci.edu/hlsps/everest/v2/c02/203700000/71098/hlsp_everest_k2_llc_203771098-c02_kepler_v2.0_lc.fits"
with fits.open(lc_url) as hdus:
lc = hdus[1].data
lc_hdr = hdus[1].header
# Work out the exposure time
texp = lc_hdr["FRAMETIM"] * lc_hdr["NUM_FRM"]
texp /= 60.0 * 60.0 * 24.0
# Mask bad data
m = (np.arange(len(lc)) > 100) & np.isfinite(lc["FLUX"]) & np.isfinite(lc["TIME"])
bad_bits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 16, 17]
qual = lc["QUALITY"]
for b in bad_bits:
m &= qual & 2 ** (b - 1) == 0
# Convert to parts per thousand
x = lc["TIME"][m]
y = lc["FLUX"][m]
mu = np.median(y)
y = (y / mu - 1) * 1e3
# Identify outliers
m = np.ones(len(y), dtype=bool)
for i in range(10):
y_prime = np.interp(x, x[m], y[m])
smooth = savgol_filter(y_prime, 101, polyorder=3)
resid = y - smooth
sigma = np.sqrt(np.mean(resid ** 2))
m0 = np.abs(resid) < 3 * sigma
if m.sum() == m0.sum():
m = m0
break
m = m0
# Only discard positive outliers
m = resid < 3 * sigma
# Shift the data so that the K2 data start at t=0. This tends to make the fit
# better behaved since t0 covaries with period.
x_ref = np.min(x[m])
x -= x_ref
# Plot the data
plt.plot(x, y, "k", label="data")
plt.plot(x, smooth)
plt.plot(x[~m], y[~m], "xr", label="outliers")
plt.legend(fontsize=12)
plt.xlim(x.min(), x.max())
plt.xlabel("time")
plt.ylabel("flux")
# Make sure that the data type is consistent
x = np.ascontiguousarray(x[m], dtype=np.float64)
y = np.ascontiguousarray(y[m], dtype=np.float64)
smooth = np.ascontiguousarray(smooth[m], dtype=np.float64)
Then the RVs from RadVel:
import pandas as pd
url = "https://raw.githubusercontent.com/California-Planet-Search/radvel/master/example_data/epic203771098.csv"
data = pd.read_csv(url, index_col=0)
# Don't forget to remove the time offset from above!
x_rv = np.array(data.t) - x_ref
y_rv = np.array(data.vel)
yerr_rv = np.array(data.errvel)
plt.errorbar(x_rv, y_rv, yerr=yerr_rv, fmt=".k")
plt.xlabel("time [days]")
plt.ylabel("radial velocity [m/s]");
We can initialize the transit parameters using the box least squares periodogram from AstroPy. (Note: you’ll need AstroPy v3.1 or more recent to use this feature.) A full discussion of transit detection and vetting is beyond the scope of this tutorial, so let’s assume that we know that there are two periodic transiting planets in this dataset.
from astropy.timeseries import BoxLeastSquares
m = np.zeros(len(x), dtype=bool)
period_grid = np.exp(np.linspace(np.log(5), np.log(50), 50000))
bls_results = []
periods = []
t0s = []
depths = []
# Compute the periodogram for each planet by iteratively masking out
# transits from the higher signal to noise planets. Here we're assuming
# that we know that there are exactly two planets.
for i in range(2):
bls = BoxLeastSquares(x[~m], y[~m] - smooth[~m])
bls_power = bls.power(period_grid, 0.1, oversample=20)
bls_results.append(bls_power)
# Save the highest peak as the planet candidate
index = np.argmax(bls_power.power)
periods.append(bls_power.period[index])
t0s.append(bls_power.transit_time[index])
depths.append(bls_power.depth[index])
# Mask the data points that are in transit for this candidate
m |= bls.transit_mask(x, periods[-1], 0.5, t0s[-1])
Let’s plot the initial transit estimates based on these periodograms:
fig, axes = plt.subplots(len(bls_results), 2, figsize=(15, 10))
for i in range(len(bls_results)):
# Plot the periodogram
ax = axes[i, 0]
ax.axvline(np.log10(periods[i]), color="C1", lw=5, alpha=0.8)
ax.plot(np.log10(bls_results[i].period), bls_results[i].power, "k")
ax.annotate(
"period = {0:.4f} d".format(periods[i]),
(0, 1),
xycoords="axes fraction",
xytext=(5, -5),
textcoords="offset points",
va="top",
ha="left",
fontsize=12,
)
ax.set_ylabel("bls power")
ax.set_yticks([])
ax.set_xlim(np.log10(period_grid.min()), np.log10(period_grid.max()))
if i < len(bls_results) - 1:
ax.set_xticklabels([])
else:
ax.set_xlabel("log10(period)")
# Plot the folded transit
ax = axes[i, 1]
p = periods[i]
x_fold = (x - t0s[i] + 0.5 * p) % p - 0.5 * p
m = np.abs(x_fold) < 0.4
ax.plot(x_fold[m], y[m] - smooth[m], ".k")
# Overplot the phase binned light curve
bins = np.linspace(-0.41, 0.41, 32)
denom, _ = np.histogram(x_fold, bins)
num, _ = np.histogram(x_fold, bins, weights=y - smooth)
denom[num == 0] = 1.0
ax.plot(0.5 * (bins[1:] + bins[:-1]), num / denom, color="C1")
ax.set_xlim(-0.4, 0.4)
ax.set_ylabel("relative flux [ppt]")
if i < len(bls_results) - 1:
ax.set_xticklabels([])
else:
ax.set_xlabel("time since transit")
fig.subplots_adjust(hspace=0.02)
The discovery paper for K2-24 (Petigura et al. (2016)) includes the following estimates of the stellar mass and radius in Solar units:
M_star_petigura = 1.12, 0.05
R_star_petigura = 1.21, 0.11
Finally, using this stellar mass, we can also estimate the minimum masses of the planets given these transit parameters.
import exoplanet as xo
import astropy.units as u
msini = xo.estimate_minimum_mass(
periods, x_rv, y_rv, yerr_rv, t0s=t0s, m_star=M_star_petigura[0]
)
msini = msini.to(u.M_earth)
print(msini)
[32.80060146 23.89885976] earthMass
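As an aside, the initial guess for logr in the model below follows from the approximate relation depth ≈ (r / R_star)^2, with the depth converted from parts per thousand to a fraction. Here’s a minimal sketch of that conversion (using the depths and R_star_petigura values from above; not part of the original notebook):
# Rough initial planet radii implied by the BLS depths (illustrative, not from the notebook)
r_guess = R_star_petigura[0] * np.sqrt(1e-3 * np.array(depths))
print(r_guess)  # planet radii in Solar radii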
Now, let’s define our full model in PyMC3. There’s a lot going on here, but I’ve tried to comment it and most of it should be familiar from the previous tutorials (Radial velocity fitting, Transit fitting, Scalable Gaussian processes in PyMC3, and Gaussian process models for stellar variability). In this case, I’ve put the model inside a model “factory” function because we’ll do some sigma clipping below.
import pymc3 as pm
import theano.tensor as tt
t_rv = np.linspace(x_rv.min() - 5, x_rv.max() + 5, 1000)
def build_model(mask=None, start=None):
if mask is None:
mask = np.ones(len(x), dtype=bool)
with pm.Model() as model:
# Parameters for the stellar properties
mean = pm.Normal("mean", mu=0.0, sd=10.0)
u_star = xo.distributions.QuadLimbDark("u_star")
BoundedNormal = pm.Bound(pm.Normal, lower=0, upper=3)
m_star = BoundedNormal("m_star", mu=M_star_petigura[0], sd=M_star_petigura[1])
r_star = BoundedNormal("r_star", mu=R_star_petigura[0], sd=R_star_petigura[1])
# Orbital parameters for the planets
logm = pm.Normal("logm", mu=np.log(msini.value), sd=1, shape=2)
logP = pm.Normal("logP", mu=np.log(periods), sd=1, shape=2)
t0 = pm.Normal("t0", mu=np.array(t0s), sd=1, shape=2)
logr = pm.Normal(
"logr",
mu=0.5 * np.log(1e-3 * np.array(depths)) + np.log(R_star_petigura[0]),
sd=1.0,
shape=2,
)
r_pl = pm.Deterministic("r_pl", tt.exp(logr))
ror = pm.Deterministic("ror", r_pl / r_star)
b = xo.distributions.ImpactParameter("b", ror=ror, shape=2)
ecc = xo.distributions.eccentricity.vaneylen19(
"ecc", multi=True, shape=2, testval=np.array([0.01, 0.01])
)
omega = xo.distributions.Angle("omega", shape=2)
# RV jitter & a quadratic RV trend
logs_rv = pm.Normal("logs_rv", mu=np.log(np.median(yerr_rv)), sd=5)
trend = pm.Normal("trend", mu=0, sd=10.0 ** -np.arange(3)[::-1], shape=3)
# Transit jitter & GP parameters
logs2 = pm.Normal("logs2", mu=np.log(np.var(y[mask])), sd=10)
logw0 = pm.Normal("logw0", mu=0.0, sd=10)
logSw4 = pm.Normal("logSw4", mu=np.log(np.var(y[mask])), sd=10)
# Tracking planet parameters
period = pm.Deterministic("period", tt.exp(logP))
m_pl = pm.Deterministic("m_pl", tt.exp(logm))
# Orbit model
orbit = xo.orbits.KeplerianOrbit(
r_star=r_star,
m_star=m_star,
period=period,
t0=t0,
b=b,
m_planet=xo.units.with_unit(m_pl, msini.unit),
ecc=ecc,
omega=omega,
)
# Compute the model light curve using starry
light_curves = (
xo.LimbDarkLightCurve(u_star).get_light_curve(
orbit=orbit, r=r_pl, t=x[mask], texp=texp
)
* 1e3
)
light_curve = pm.math.sum(light_curves, axis=-1) + mean
pm.Deterministic("light_curves", light_curves)
# GP model for the light curve
kernel = xo.gp.terms.SHOTerm(log_Sw4=logSw4, log_w0=logw0, Q=1 / np.sqrt(2))
gp = xo.gp.GP(kernel, x[mask], tt.exp(logs2) + tt.zeros(mask.sum()))
pm.Potential("transit_obs", gp.log_likelihood(y[mask] - light_curve))
pm.Deterministic("gp_pred", gp.predict())
# Set up the RV model and save it as a deterministic
# for plotting purposes later
vrad = orbit.get_radial_velocity(x_rv)
pm.Deterministic("vrad", vrad)
# Define the background RV model
A = np.vander(x_rv - 0.5 * (x_rv.min() + x_rv.max()), 3)
bkg = pm.Deterministic("bkg", tt.dot(A, trend))
# The likelihood for the RVs
rv_model = pm.Deterministic("rv_model", tt.sum(vrad, axis=-1) + bkg)
err = tt.sqrt(yerr_rv ** 2 + tt.exp(2 * logs_rv))
pm.Normal("obs", mu=rv_model, sd=err, observed=y_rv)
vrad_pred = orbit.get_radial_velocity(t_rv)
pm.Deterministic("vrad_pred", vrad_pred)
A_pred = np.vander(t_rv - 0.5 * (x_rv.min() + x_rv.max()), 3)
bkg_pred = pm.Deterministic("bkg_pred", tt.dot(A_pred, trend))
pm.Deterministic("rv_model_pred", tt.sum(vrad_pred, axis=-1) + bkg_pred)
# Fit for the maximum a posteriori parameters. I've found that I can get a
# better solution by optimizing different combinations of parameters in turn
if start is None:
start = model.test_point
map_soln = xo.optimize(start=start, vars=[trend])
map_soln = xo.optimize(start=map_soln, vars=[logs2])
map_soln = xo.optimize(start=map_soln, vars=[logr, b])
map_soln = xo.optimize(start=map_soln, vars=[logP, t0])
map_soln = xo.optimize(start=map_soln, vars=[logs2, logSw4])
map_soln = xo.optimize(start=map_soln, vars=[logw0])
map_soln = xo.optimize(start=map_soln)
return model, map_soln
model0, map_soln0 = build_model()
optimizing logp for variables: [trend]
12it [00:11, 1.08it/s, logp=-8.276752e+03]
message: Optimization terminated successfully.
logp: -8290.066560739391 -> -8276.751877588842
optimizing logp for variables: [logs2]
15it [00:01, 9.52it/s, logp=2.110025e+03]
message: Optimization terminated successfully.
logp: -8276.751877588842 -> 2110.02456128196
optimizing logp for variables: [b, logr, r_star]
105it [00:02, 44.46it/s, logp=2.689040e+03]
message: Desired error not necessarily achieved due to precision loss.
logp: 2110.02456128196 -> 2689.0397283218103
optimizing logp for variables: [t0, logP]
79it [00:01, 42.17it/s, logp=3.244872e+03]
message: Desired error not necessarily achieved due to precision loss.
logp: 2689.039728321822 -> 3244.8715047912565
optimizing logp for variables: [logSw4, logs2]
13it [00:01, 7.02it/s, logp=3.899740e+03]
message: Optimization terminated successfully.
logp: 3244.8715047912547 -> 3899.7404342927084
optimizing logp for variables: [logw0]
12it [00:01, 6.70it/s, logp=4.004137e+03]
message: Optimization terminated successfully.
logp: 3899.7404342927084 -> 4004.1373334508944
optimizing logp for variables: [logSw4, logw0, logs2, trend, logs_rv, omega, ecc_frac, ecc_sigma_rayleigh, ecc_sigma_gauss, ecc, b, logr, t0, logP, logm, r_star, m_star, u_star, mean]
339it [00:03, 89.50it/s, logp=4.775494e+03]
message: Desired error not necessarily achieved due to precision loss.
logp: 4004.137333450869 -> 4775.494114320922
Now let’s plot the MAP radial velocity model.
def plot_rv_curve(soln):
fig, axes = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
ax = axes[0]
ax.errorbar(x_rv, y_rv, yerr=yerr_rv, fmt=".k")
ax.plot(t_rv, soln["vrad_pred"], "--k", alpha=0.5)
ax.plot(t_rv, soln["bkg_pred"], ":k", alpha=0.5)
ax.plot(t_rv, soln["rv_model_pred"], label="model")
ax.legend(fontsize=10)
ax.set_ylabel("radial velocity [m/s]")
ax = axes[1]
err = np.sqrt(yerr_rv ** 2 + np.exp(2 * soln["logs_rv"]))
ax.errorbar(x_rv, y_rv - soln["rv_model"], yerr=err, fmt=".k")
ax.axhline(0, color="k", lw=1)
ax.set_ylabel("residuals [m/s]")
ax.set_xlim(t_rv.min(), t_rv.max())
ax.set_xlabel("time [days]")
plot_rv_curve(map_soln0)
That looks pretty similar to what we got in Radial velocity fitting. Now let’s also plot the transit model.
def plot_light_curve(soln, mask=None):
if mask is None:
mask = np.ones(len(x), dtype=bool)
fig, axes = plt.subplots(3, 1, figsize=(10, 7), sharex=True)
ax = axes[0]
ax.plot(x[mask], y[mask], "k", label="data")
gp_mod = soln["gp_pred"] + soln["mean"]
ax.plot(x[mask], gp_mod, color="C2", label="gp model")
ax.legend(fontsize=10)
ax.set_ylabel("relative flux [ppt]")
ax = axes[1]
ax.plot(x[mask], y[mask] - gp_mod, "k", label="de-trended data")
for i, l in enumerate("bc"):
mod = soln["light_curves"][:, i]
ax.plot(x[mask], mod, label="planet {0}".format(l))
ax.legend(fontsize=10, loc=3)
ax.set_ylabel("de-trended flux [ppt]")
ax = axes[2]
mod = gp_mod + np.sum(soln["light_curves"], axis=-1)
ax.plot(x[mask], y[mask] - mod, "k")
ax.axhline(0, color="#aaaaaa", lw=1)
ax.set_ylabel("residuals [ppt]")
ax.set_xlim(x[mask].min(), x[mask].max())
ax.set_xlabel("time [days]")
return fig
plot_light_curve(map_soln0);
There are still a few outliers in the light curve, and it can be useful to remove them before doing the full fit because both the GP and transit parameters can be sensitive to them.
To remove the outliers, we’ll look at the empirical RMS of the residuals away from the GP + transit model and remove anything that is more than a 7-sigma outlier.
mod = (
map_soln0["gp_pred"]
+ map_soln0["mean"]
+ np.sum(map_soln0["light_curves"], axis=-1)
)
resid = y - mod
rms = np.sqrt(np.median(resid ** 2))
mask = np.abs(resid) < 7 * rms
plt.plot(x, resid, "k", label="data")
plt.plot(x[~mask], resid[~mask], "xr", label="outliers")
plt.axhline(0, color="#aaaaaa", lw=1)
plt.ylabel("residuals [ppt]")
plt.xlabel("time [days]")
plt.legend(fontsize=12, loc=4)
plt.xlim(x.min(), x.max());
That looks better. Let’s re-build our model with this sigma-clipped dataset.
model, map_soln = build_model(mask, map_soln0)
plot_light_curve(map_soln, mask);
optimizing logp for variables: [trend]
5it [00:01, 3.03it/s, logp=5.225381e+03]
message: Optimization terminated successfully.
logp: 5225.381218190808 -> 5225.381218190808
optimizing logp for variables: [logs2]
9it [00:01, 6.40it/s, logp=5.307429e+03]
message: Optimization terminated successfully.
logp: 5225.381218190808 -> 5307.42938442338
optimizing logp for variables: [b, logr, r_star]
26it [00:01, 14.55it/s, logp=5.318396e+03]
message: Optimization terminated successfully.
logp: 5307.42938442338 -> 5318.395979595314
optimizing logp for variables: [t0, logP]
165it [00:02, 67.34it/s, logp=5.319775e+03]
message: Desired error not necessarily achieved due to precision loss.
logp: 5318.395979595323 -> 5319.775397150623
optimizing logp for variables: [logSw4, logs2]
10it [00:01, 6.97it/s, logp=5.320504e+03]
message: Optimization terminated successfully.
logp: 5319.77539715063 -> 5320.5042723781
optimizing logp for variables: [logw0]
9it [00:01, 6.15it/s, logp=5.320538e+03]
message: Optimization terminated successfully.
logp: 5320.5042723781 -> 5320.538333858067
optimizing logp for variables: [logSw4, logw0, logs2, trend, logs_rv, omega, ecc_frac, ecc_sigma_rayleigh, ecc_sigma_gauss, ecc, b, logr, t0, logP, logm, r_star, m_star, u_star, mean]
141it [00:01, 73.40it/s, logp=5.322099e+03]
message: Desired error not necessarily achieved due to precision loss.
logp: 5320.538333858038 -> 5322.099130377621
Great! Now we’re ready to sample.
The sampling for this model is the same as for all the previous tutorials, but it takes a bit longer (about 2 hours on my laptop). This is partly because the model is more expensive to compute than the previous ones and partly because there are some non-affine degeneracies in the problem (for example between impact parameter and eccentricity). It might be worth thinking about reparameterizations (in terms of duration instead of eccentricity), but that’s beyond the scope of this tutorial. Besides, using more traditional MCMC methods, this would have taken a lot more than 2 hours to get >1000 effective samples!
np.random.seed(123)
with model:
trace = pm.sample(
tune=6000,
draws=3000,
start=map_soln,
chains=4,
step=xo.get_dense_nuts_step(target_accept=0.9),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [logSw4, logw0, logs2, trend, logs_rv, omega, ecc_frac, ecc_sigma_rayleigh, ecc_sigma_gauss, ecc, b, logr, t0, logP, logm, r_star, m_star, u_star, mean]
Sampling 4 chains: 100%|██████████| 36000/36000 [1:10:35<00:00, 1.90draws/s]
There were 81 divergences after tuning. Increase target_accept or reparameterize.
There were 50 divergences after tuning. Increase target_accept or reparameterize.
There were 46 divergences after tuning. Increase target_accept or reparameterize.
There were 58 divergences after tuning. Increase target_accept or reparameterize.
The number of effective samples is smaller than 10% for some parameters.
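Since the sampler reports a handful of divergences, it can be worth checking how they are distributed across chains before interpreting the results. Here’s a minimal sketch using PyMC3’s sampler statistics (assuming the trace object returned above):
# Count divergent transitions in each chain (illustrative check, not from the notebook)
divergent = trace.get_sampler_stats("diverging", combine=False)
for i, d in enumerate(divergent):
    print("chain {0}: {1} divergences".format(i, d.sum()))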
Let’s look at the convergence diagnostics for some of the key parameters:
pm.summary(trace, varnames=["period", "r_pl", "m_pl", "ecc", "omega", "b"])
parameter | mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat |
---|---|---|---|---|---|---|---|
period__0 | 42.363456 | 0.000396 | 0.000004 | 42.362689 | 42.364234 | 10642.409433 | 0.999895 |
period__1 | 20.885338 | 0.000202 | 0.000002 | 20.884931 | 20.885721 | 8645.015917 | 1.000072 |
r_pl__0 | 0.077521 | 0.004554 | 0.000111 | 0.068177 | 0.085870 | 1941.768212 | 1.000340 |
r_pl__1 | 0.055888 | 0.003533 | 0.000094 | 0.048931 | 0.062613 | 1647.351091 | 1.000157 |
m_pl__0 | 27.223719 | 6.014221 | 0.065348 | 15.693746 | 39.294948 | 6928.785328 | 1.000213 |
m_pl__1 | 22.130263 | 4.413712 | 0.047499 | 13.497093 | 31.012766 | 6670.330514 | 1.000238 |
ecc__0 | 0.039703 | 0.034472 | 0.000653 | 0.000002 | 0.107371 | 2786.435033 | 1.000697 |
ecc__1 | 0.107645 | 0.093834 | 0.002300 | 0.000002 | 0.293460 | 1503.452582 | 1.000982 |
omega__0 | 0.143650 | 2.014668 | 0.035205 | -3.134686 | 2.927897 | 2847.120109 | 1.000033 |
omega__1 | -0.347757 | 1.034447 | 0.021766 | -2.506963 | 2.330928 | 2172.343269 | 1.000205 |
b__0 | 0.588049 | 0.050939 | 0.001340 | 0.485199 | 0.669322 | 1440.172010 | 1.001588 |
b__1 | 0.538854 | 0.111060 | 0.003621 | 0.311607 | 0.726459 | 1048.071567 | 1.000011 |
As you can see, the effective number of samples for the impact parameters and eccentricities is lower than for the other parameters. This is because of the correlations that I mentioned above:
import corner
varnames = ["b", "ecc"]
samples = pm.trace_to_dataframe(trace, varnames=varnames)
fig = corner.corner(samples);
Finally, as in the Radial velocity fitting and Transit fitting tutorials, we can make folded plots of the transits and the radial velocities and compare to the posterior model predictions. (Note: planets b and c in this tutorial are swapped compared to the labels from Petigura et al. (2016))
for n, letter in enumerate("bc"):
plt.figure()
# Compute the GP prediction
gp_mod = np.median(trace["gp_pred"] + trace["mean"][:, None], axis=0)
# Get the posterior median orbital parameters
p = np.median(trace["period"][:, n])
t0 = np.median(trace["t0"][:, n])
# Compute the median of posterior estimate of the contribution from
# the other planet. Then we can remove this from the data to plot
# just the planet we care about.
other = np.median(trace["light_curves"][:, :, (n + 1) % 2], axis=0)
# Plot the folded data
x_fold = (x[mask] - t0 + 0.5 * p) % p - 0.5 * p
plt.plot(x_fold, y[mask] - gp_mod - other, ".k", label="data", zorder=-1000)
# Plot the folded model
inds = np.argsort(x_fold)
inds = inds[np.abs(x_fold)[inds] < 0.3]
pred = trace["light_curves"][:, inds, n]
pred = np.percentile(pred, [16, 50, 84], axis=0)
plt.plot(x_fold[inds], pred[1], color="C1", label="model")
art = plt.fill_between(
x_fold[inds], pred[0], pred[2], color="C1", alpha=0.5, zorder=1000
)
art.set_edgecolor("none")
# Annotate the plot with the planet's period
txt = "period = {0:.4f} +/- {1:.4f} d".format(
np.mean(trace["period"][:, n]), np.std(trace["period"][:, n])
)
plt.annotate(
txt,
(0, 0),
xycoords="axes fraction",
xytext=(5, 5),
textcoords="offset points",
ha="left",
va="bottom",
fontsize=12,
)
plt.legend(fontsize=10, loc=4)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("time since transit [days]")
plt.ylabel("de-trended flux")
plt.title("K2-24{0}".format(letter))
plt.xlim(-0.3, 0.3)
for n, letter in enumerate("bc"):
plt.figure()
# Get the posterior median orbital parameters
p = np.median(trace["period"][:, n])
t0 = np.median(trace["t0"][:, n])
# Compute the median of posterior estimate of the background RV
# and the contribution from the other planet. Then we can remove
# this from the data to plot just the planet we care about.
other = np.median(trace["vrad"][:, :, (n + 1) % 2], axis=0)
other += np.median(trace["bkg"], axis=0)
# Plot the folded data
x_fold = (x_rv - t0 + 0.5 * p) % p - 0.5 * p
plt.errorbar(x_fold, y_rv - other, yerr=yerr_rv, fmt=".k", label="data")
# Compute the posterior prediction for the folded RV model for this
# planet
t_fold = (t_rv - t0 + 0.5 * p) % p - 0.5 * p
inds = np.argsort(t_fold)
pred = np.percentile(trace["vrad_pred"][:, inds, n], [16, 50, 84], axis=0)
plt.plot(t_fold[inds], pred[1], color="C1", label="model")
art = plt.fill_between(t_fold[inds], pred[0], pred[2], color="C1", alpha=0.3)
art.set_edgecolor("none")
plt.legend(fontsize=10)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("phase [days]")
plt.ylabel("radial velocity [m/s]")
plt.title("K2-24{0}".format(letter));
We can also compute the posterior constraints on the planet densities.
volume = 4 / 3 * np.pi * trace["r_pl"] ** 3
density = u.Quantity(trace["m_pl"] / volume, unit=u.M_earth / u.R_sun ** 3)
density = density.to(u.g / u.cm ** 3).value
bins = np.linspace(0, 1.1, 45)
for n, letter in enumerate("bc"):
plt.hist(
density[:, n],
bins,
histtype="step",
lw=2,
label="K2-24{0}".format(letter),
density=True,
)
plt.yticks([])
plt.legend(fontsize=12)
plt.xlim(bins[0], bins[-1])
plt.xlabel("density [g/cc]")
plt.ylabel("posterior density");
As described in the Citing exoplanet & its dependencies tutorial, we can use exoplanet.citations.get_citations_for_model() to construct an acknowledgement and BibTeX listing that includes the relevant citations for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:agol19, exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:foremanmackey17, exoplanet:foremanmackey18, exoplanet:kipping13, exoplanet:luger18, exoplanet:pymc3, exoplanet:theano, exoplanet:vaneylen19}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
In this tutorial, we will reproduce the fits to the transiting planet in the Pi Mensae system discovered by Huang et al. (2018). The data processing and model are similar to the Case study: K2-24, putting it all together tutorial, but with a few extra bits like aperture selection and de-trending.
To start, we need to download the target pixel file:
import numpy as np
import lightkurve as lk
import matplotlib.pyplot as plt
tpf_file = lk.search_targetpixelfile("TIC 261136679", sector=1).download()
with tpf_file.hdu as hdu:
tpf = hdu[1].data
tpf_hdr = hdu[1].header
texp = tpf_hdr["FRAMETIM"] * tpf_hdr["NUM_FRM"]
texp /= 60.0 * 60.0 * 24.0
time = tpf["TIME"]
flux = tpf["FLUX"]
m = np.any(np.isfinite(flux), axis=(1, 2)) & (tpf["QUALITY"] == 0)
ref_time = 0.5 * (np.min(time[m]) + np.max(time[m]))
time = np.ascontiguousarray(time[m] - ref_time, dtype=np.float64)
flux = np.ascontiguousarray(flux[m], dtype=np.float64)
mean_img = np.median(flux, axis=0)
plt.imshow(mean_img.T, cmap="gray_r")
plt.title("TESS image of Pi Men")
plt.xticks([])
plt.yticks([]);
Next, we’ll select an aperture using a hacky method that tries to minimize the windowed scatter in the light curve (something like the CDPP).
from scipy.signal import savgol_filter
# Sort the pixels by median brightness
order = np.argsort(mean_img.flatten())[::-1]
# A function to estimate the windowed scatter in a lightcurve
def estimate_scatter_with_mask(mask):
f = np.sum(flux[:, mask], axis=-1)
smooth = savgol_filter(f, 1001, polyorder=5)
return 1e6 * np.sqrt(np.median((f / smooth - 1) ** 2))
# Loop over pixels ordered by brightness and add them one-by-one
# to the aperture
masks, scatters = [], []
for i in range(10, 100):
msk = np.zeros_like(mean_img, dtype=bool)
msk[np.unravel_index(order[:i], mean_img.shape)] = True
scatter = estimate_scatter_with_mask(msk)
masks.append(msk)
scatters.append(scatter)
# Choose the aperture that minimizes the scatter
pix_mask = masks[np.argmin(scatters)]
# Plot the selected aperture
plt.imshow(mean_img.T, cmap="gray_r")
plt.imshow(pix_mask.T, cmap="Reds", alpha=0.3)
plt.title("selected aperture")
plt.xticks([])
plt.yticks([]);
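For reference, we can check the size and windowed scatter of the selected aperture. Here’s a minimal sketch (not part of the original notebook) using the pix_mask and scatters variables from above:
# Size and windowed scatter of the chosen aperture (illustrative check)
print("aperture size: {0} pixels".format(pix_mask.sum()))
print("windowed scatter: {0:.1f} ppm".format(min(scatters)))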
This aperture produces the following light curve:
plt.figure(figsize=(10, 5))
sap_flux = np.sum(flux[:, pix_mask], axis=-1)
sap_flux = (sap_flux / np.median(sap_flux) - 1) * 1e3
plt.plot(time, sap_flux, "k")
plt.xlabel("time [days]")
plt.ylabel("relative flux [ppt]")
plt.title("raw light curve")
plt.xlim(time.min(), time.max());
This doesn’t look terrible, but we’re still going to want to de-trend it a little bit. We’ll use “pixel level decorrelation” (PLD) to de-trend, following the method used by Everest. Specifically, we’ll use first order PLD plus the top few PCA components of the second order PLD basis.
# Build the first order PLD basis
X_pld = np.reshape(flux[:, pix_mask], (len(flux), -1))
X_pld = X_pld / np.sum(flux[:, pix_mask], axis=-1)[:, None]
# Build the second order PLD basis and run PCA to reduce the number of dimensions
X2_pld = np.reshape(X_pld[:, None, :] * X_pld[:, :, None], (len(flux), -1))
U, _, _ = np.linalg.svd(X2_pld, full_matrices=False)
X2_pld = U[:, : X_pld.shape[1]]
# Construct the design matrix and fit for the PLD model
X_pld = np.concatenate((np.ones((len(flux), 1)), X_pld, X2_pld), axis=-1)
XTX = np.dot(X_pld.T, X_pld)
w_pld = np.linalg.solve(XTX, np.dot(X_pld.T, sap_flux))
pld_flux = np.dot(X_pld, w_pld)
# Plot the de-trended light curve
plt.figure(figsize=(10, 5))
plt.plot(time, sap_flux - pld_flux, "k")
plt.xlabel("time [days]")
plt.ylabel("de-trended flux [ppt]")
plt.title("initial de-trended light curve")
plt.xlim(time.min(), time.max());
That looks better.
Now, let’s use the box least squares periodogram from AstroPy (Note: you’ll need AstroPy v3.1 or more recent to use this feature) to estimate the period, phase, and depth of the transit.
from astropy.timeseries import BoxLeastSquares
period_grid = np.exp(np.linspace(np.log(1), np.log(15), 50000))
bls = BoxLeastSquares(time, sap_flux - pld_flux)
bls_power = bls.power(period_grid, 0.1, oversample=20)
# Save the highest peak as the planet candidate
index = np.argmax(bls_power.power)
bls_period = bls_power.period[index]
bls_t0 = bls_power.transit_time[index]
bls_depth = bls_power.depth[index]
transit_mask = bls.transit_mask(time, bls_period, 0.2, bls_t0)
fig, axes = plt.subplots(2, 1, figsize=(10, 10))
# Plot the periodogram
ax = axes[0]
ax.axvline(np.log10(bls_period), color="C1", lw=5, alpha=0.8)
ax.plot(np.log10(bls_power.period), bls_power.power, "k")
ax.annotate(
"period = {0:.4f} d".format(bls_period),
(0, 1),
xycoords="axes fraction",
xytext=(5, -5),
textcoords="offset points",
va="top",
ha="left",
fontsize=12,
)
ax.set_ylabel("bls power")
ax.set_yticks([])
ax.set_xlim(np.log10(period_grid.min()), np.log10(period_grid.max()))
ax.set_xlabel("log10(period)")
# Plot the folded transit
ax = axes[1]
x_fold = (time - bls_t0 + 0.5 * bls_period) % bls_period - 0.5 * bls_period
m = np.abs(x_fold) < 0.4
ax.plot(x_fold[m], sap_flux[m] - pld_flux[m], ".k")
# Overplot the phase binned light curve
bins = np.linspace(-0.41, 0.41, 32)
denom, _ = np.histogram(x_fold, bins)
num, _ = np.histogram(x_fold, bins, weights=sap_flux - pld_flux)
denom[num == 0] = 1.0
ax.plot(0.5 * (bins[1:] + bins[:-1]), num / denom, color="C1")
ax.set_xlim(-0.3, 0.3)
ax.set_ylabel("de-trended flux [ppt]")
ax.set_xlabel("time since transit");
Now that we know where the transits are, it’s generally good practice to de-trend the data one more time with the transits masked so that the de-trending doesn’t overfit the transits. Let’s do that.
m = ~transit_mask
XTX = np.dot(X_pld[m].T, X_pld[m])
w_pld = np.linalg.solve(XTX, np.dot(X_pld[m].T, sap_flux[m]))
pld_flux = np.dot(X_pld, w_pld)
x = np.ascontiguousarray(time, dtype=np.float64)
y = np.ascontiguousarray(sap_flux - pld_flux, dtype=np.float64)
plt.figure(figsize=(10, 5))
plt.plot(time, y, "k")
plt.xlabel("time [days]")
plt.ylabel("de-trended flux [ppt]")
plt.title("final de-trended light curve")
plt.xlim(time.min(), time.max());
To confirm that we didn’t overfit the transit, we can look at the folded light curve for the PLD model near transit. This shouldn’t have any residual transit signal, and that looks correct here:
plt.figure(figsize=(10, 5))
x_fold = (x - bls_t0 + 0.5 * bls_period) % bls_period - 0.5 * bls_period
m = np.abs(x_fold) < 0.3
plt.plot(x_fold[m], pld_flux[m], ".k", ms=4)
bins = np.linspace(-0.5, 0.5, 60)
denom, _ = np.histogram(x_fold, bins)
num, _ = np.histogram(x_fold, bins, weights=pld_flux)
denom[num == 0] = 1.0
plt.plot(0.5 * (bins[1:] + bins[:-1]), num / denom, color="C1", lw=2)
plt.xlim(-0.2, 0.2)
plt.xlabel("time since transit")
plt.ylabel("PLD model flux");
The transit model, initialization, and sampling are all nearly the same as the one in Case study: K2-24, putting it all together.
import exoplanet as xo
import pymc3 as pm
import theano.tensor as tt
def build_model(mask=None, start=None):
if mask is None:
mask = np.ones(len(x), dtype=bool)
with pm.Model() as model:
# Parameters for the stellar properties
mean = pm.Normal("mean", mu=0.0, sd=10.0)
u_star = xo.distributions.QuadLimbDark("u_star")
# Stellar parameters from Huang et al (2018)
M_star_huang = 1.094, 0.039
R_star_huang = 1.10, 0.023
BoundedNormal = pm.Bound(pm.Normal, lower=0, upper=3)
m_star = BoundedNormal("m_star", mu=M_star_huang[0], sd=M_star_huang[1])
r_star = BoundedNormal("r_star", mu=R_star_huang[0], sd=R_star_huang[1])
# Orbital parameters for the planets
logP = pm.Normal("logP", mu=np.log(bls_period), sd=1)
t0 = pm.Normal("t0", mu=bls_t0, sd=1)
logr = pm.Normal(
"logr",
sd=1.0,
mu=0.5 * np.log(1e-3 * np.array(bls_depth)) + np.log(R_star_huang[0]),
)
r_pl = pm.Deterministic("r_pl", tt.exp(logr))
ror = pm.Deterministic("ror", r_pl / r_star)
b = xo.distributions.ImpactParameter("b", ror=ror)
# This is the eccentricity prior from Kipping (2013):
# https://arxiv.org/abs/1306.4982
ecc = xo.distributions.eccentricity.kipping13("ecc", testval=0.1)
omega = xo.distributions.Angle("omega")
# Transit jitter & GP parameters
logs2 = pm.Normal("logs2", mu=np.log(np.var(y[mask])), sd=10)
logw0 = pm.Normal("logw0", mu=0, sd=10)
logSw4 = pm.Normal("logSw4", mu=np.log(np.var(y[mask])), sd=10)
# Tracking planet parameters
period = pm.Deterministic("period", tt.exp(logP))
# Orbit model
orbit = xo.orbits.KeplerianOrbit(
r_star=r_star,
m_star=m_star,
period=period,
t0=t0,
b=b,
ecc=ecc,
omega=omega,
)
# Compute the model light curve using starry
light_curves = (
xo.LimbDarkLightCurve(u_star).get_light_curve(
orbit=orbit, r=r_pl, t=x[mask], texp=texp
)
* 1e3
)
light_curve = pm.math.sum(light_curves, axis=-1) + mean
pm.Deterministic("light_curves", light_curves)
# GP model for the light curve
kernel = xo.gp.terms.SHOTerm(log_Sw4=logSw4, log_w0=logw0, Q=1 / np.sqrt(2))
gp = xo.gp.GP(kernel, x[mask], tt.exp(logs2) + tt.zeros(mask.sum()), J=2)
pm.Potential("transit_obs", gp.log_likelihood(y[mask] - light_curve))
pm.Deterministic("gp_pred", gp.predict())
# Fit for the maximum a posteriori parameters. I've found that I can get a
# better solution by optimizing different combinations of parameters in turn
if start is None:
start = model.test_point
map_soln = xo.optimize(start=start, vars=[logs2, logSw4, logw0])
map_soln = xo.optimize(start=map_soln, vars=[logr])
map_soln = xo.optimize(start=map_soln, vars=[b])
map_soln = xo.optimize(start=map_soln, vars=[logP, t0])
map_soln = xo.optimize(start=map_soln, vars=[u_star])
map_soln = xo.optimize(start=map_soln, vars=[logr])
map_soln = xo.optimize(start=map_soln, vars=[b])
map_soln = xo.optimize(start=map_soln, vars=[ecc, omega])
map_soln = xo.optimize(start=map_soln, vars=[mean])
map_soln = xo.optimize(start=map_soln, vars=[logs2, logSw4, logw0])
map_soln = xo.optimize(start=map_soln)
return model, map_soln
model0, map_soln0 = build_model()
optimizing logp for variables: [logw0, logSw4, logs2]
20it [00:08, 2.25it/s, logp=1.264290e+04]
message: Optimization terminated successfully.
logp: 12406.06185575557 -> 12642.89955925584
optimizing logp for variables: [logr]
207it [00:02, 108.36it/s, logp=1.268195e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 12642.899559255851 -> 12681.951246651677
optimizing logp for variables: [b, logr, r_star]
64it [00:01, 33.19it/s, logp=1.294934e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 12681.951246651677 -> 12949.339148287803
optimizing logp for variables: [t0, logP]
99it [00:02, 49.15it/s, logp=1.296374e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 12949.339148287796 -> 12963.740755133214
optimizing logp for variables: [u_star]
12it [00:01, 7.60it/s, logp=1.296662e+04]
message: Optimization terminated successfully.
logp: 12963.740755133207 -> 12966.624941258444
optimizing logp for variables: [logr]
9it [00:01, 8.05it/s, logp=1.296682e+04]
message: Optimization terminated successfully.
logp: 12966.624941258455 -> 12966.818870393383
optimizing logp for variables: [b, logr, r_star]
16it [00:01, 12.90it/s, logp=1.296694e+04]
message: Optimization terminated successfully.
logp: 12966.818870393383 -> 12966.944430577478
optimizing logp for variables: [omega, ecc]
23it [00:01, 13.05it/s, logp=1.298782e+04]
message: Optimization terminated successfully.
logp: 12966.944430577478 -> 12987.822019225678
optimizing logp for variables: [mean]
5it [00:01, 2.63it/s, logp=1.298785e+04]
message: Optimization terminated successfully.
logp: 12987.822019225663 -> 12987.85030612374
optimizing logp for variables: [logw0, logSw4, logs2]
47it [00:01, 25.26it/s, logp=1.299865e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 12987.85030612374 -> 12998.653500794684
optimizing logp for variables: [logSw4, logw0, logs2, omega, ecc, ecc_beta, ecc_alpha, b, logr, t0, logP, r_star, m_star, u_star, mean]
111it [00:02, 42.58it/s, logp=1.305483e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 12998.653500794699 -> 13054.830649513216
Here’s how we plot the initial light curve model:
def plot_light_curve(soln, mask=None):
if mask is None:
mask = np.ones(len(x), dtype=bool)
fig, axes = plt.subplots(3, 1, figsize=(10, 7), sharex=True)
ax = axes[0]
ax.plot(x[mask], y[mask], "k", label="data")
gp_mod = soln["gp_pred"] + soln["mean"]
ax.plot(x[mask], gp_mod, color="C2", label="gp model")
ax.legend(fontsize=10)
ax.set_ylabel("relative flux [ppt]")
ax = axes[1]
ax.plot(x[mask], y[mask] - gp_mod, "k", label="de-trended data")
for i, l in enumerate("b"):
mod = soln["light_curves"][:, i]
ax.plot(x[mask], mod, label="planet {0}".format(l))
ax.legend(fontsize=10, loc=3)
ax.set_ylabel("de-trended flux [ppt]")
ax = axes[2]
mod = gp_mod + np.sum(soln["light_curves"], axis=-1)
ax.plot(x[mask], y[mask] - mod, "k")
ax.axhline(0, color="#aaaaaa", lw=1)
ax.set_ylabel("residuals [ppt]")
ax.set_xlim(x[mask].min(), x[mask].max())
ax.set_xlabel("time [days]")
return fig
plot_light_curve(map_soln0);
As in the Case study: K2-24, putting it all together tutorial, we can do some sigma clipping to remove significant outliers.
mod = (
map_soln0["gp_pred"]
+ map_soln0["mean"]
+ np.sum(map_soln0["light_curves"], axis=-1)
)
resid = y - mod
rms = np.sqrt(np.median(resid ** 2))
mask = np.abs(resid) < 5 * rms
plt.figure(figsize=(10, 5))
plt.plot(x, resid, "k", label="data")
plt.plot(x[~mask], resid[~mask], "xr", label="outliers")
plt.axhline(0, color="#aaaaaa", lw=1)
plt.ylabel("residuals [ppt]")
plt.xlabel("time [days]")
plt.legend(fontsize=12, loc=3)
plt.xlim(x.min(), x.max());
And then we re-build the model using the data without outliers.
model, map_soln = build_model(mask, map_soln0)
plot_light_curve(map_soln, mask);
optimizing logp for variables: [logw0, logSw4, logs2]
15it [00:02, 6.37it/s, logp=1.374030e+04]
message: Optimization terminated successfully.
logp: 13709.337756357809 -> 13740.297129156785
optimizing logp for variables: [logr]
8it [00:01, 5.66it/s, logp=1.374032e+04]
message: Optimization terminated successfully.
logp: 13740.297129156788 -> 13740.31722986988
optimizing logp for variables: [b, logr, r_star]
14it [00:01, 9.95it/s, logp=1.374032e+04]
message: Optimization terminated successfully.
logp: 13740.31722986988 -> 13740.31831713007
optimizing logp for variables: [t0, logP]
146it [00:02, 53.35it/s, logp=1.374033e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 13740.318317130062 -> 13740.326927416274
optimizing logp for variables: [u_star]
8it [00:01, 4.84it/s, logp=1.374035e+04]
message: Optimization terminated successfully.
logp: 13740.326927416267 -> 13740.350391911023
optimizing logp for variables: [logr]
8it [00:01, 6.64it/s, logp=1.374035e+04]
message: Optimization terminated successfully.
logp: 13740.35039191103 -> 13740.353904884088
optimizing logp for variables: [b, logr, r_star]
85it [00:01, 43.90it/s, logp=1.374036e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 13740.353904884088 -> 13740.362784306773
optimizing logp for variables: [omega, ecc]
125it [00:02, 56.46it/s, logp=1.374036e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 13740.362784306773 -> 13740.362835323785
optimizing logp for variables: [mean]
5it [00:01, 4.60it/s, logp=1.374037e+04]
message: Optimization terminated successfully.
logp: 13740.362835323785 -> 13740.366005030273
optimizing logp for variables: [logw0, logSw4, logs2]
9it [00:01, 5.83it/s, logp=1.374037e+04]
message: Optimization terminated successfully.
logp: 13740.366005030273 -> 13740.36600761282
optimizing logp for variables: [logSw4, logw0, logs2, omega, ecc, ecc_beta, ecc_alpha, b, logr, t0, logP, r_star, m_star, u_star, mean]
91it [00:02, 36.68it/s, logp=1.374038e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 13740.366007612805 -> 13740.377939831247
Now that we have the model, we can sample:
np.random.seed(42)
with model:
trace = pm.sample(
tune=5000,
draws=3000,
start=map_soln,
chains=4,
step=xo.get_dense_nuts_step(target_accept=0.9),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [logSw4, logw0, logs2, omega, ecc, ecc_beta, ecc_alpha, b, logr, t0, logP, r_star, m_star, u_star, mean]
Sampling 4 chains: 100%|██████████| 32000/32000 [2:31:00<00:00, 1.07draws/s]
The number of effective samples is smaller than 25% for some parameters.
pm.summary(
trace,
varnames=[
"logw0",
"logSw4",
"logs2",
"omega",
"ecc",
"r_pl",
"b",
"t0",
"logP",
"r_star",
"m_star",
"u_star",
"mean",
],
)
parameter | mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat |
---|---|---|---|---|---|---|---|
logw0 | 1.176557 | 0.135296 | 1.451517e-03 | 0.914931 | 1.443757 | 8688.248464 | 0.999916 |
logSw4 | -2.105827 | 0.317614 | 3.047235e-03 | -2.701237 | -1.464207 | 10667.609961 | 0.999894 |
logs2 | -4.382870 | 0.010754 | 1.055636e-04 | -4.404120 | -4.362122 | 11519.863118 | 0.999949 |
omega | 0.648294 | 1.715356 | 3.512658e-02 | -2.809296 | 3.141469 | 2329.886518 | 1.000512 |
ecc | 0.222420 | 0.143264 | 3.047271e-03 | 0.000121 | 0.509765 | 2381.265132 | 1.001254 |
r_pl | 0.017243 | 0.000648 | 1.334355e-05 | 0.015987 | 0.018522 | 2047.263980 | 1.001190 |
b | 0.390768 | 0.221809 | 5.168899e-03 | 0.000139 | 0.736551 | 1494.738718 | 1.001950 |
t0 | -1.197385 | 0.000680 | 1.596423e-05 | -1.198662 | -1.196055 | 1929.200269 | 1.001265 |
logP | 1.835411 | 0.000069 | 7.772674e-07 | 1.835271 | 1.835534 | 6657.291370 | 1.000026 |
r_star | 1.098913 | 0.022882 | 2.436166e-04 | 1.055207 | 1.144789 | 10624.471452 | 0.999884 |
m_star | 1.095748 | 0.039422 | 3.385010e-04 | 1.017889 | 1.170311 | 11559.600340 | 1.000214 |
u_star__0 | 0.204246 | 0.169537 | 2.327905e-03 | 0.000086 | 0.542679 | 6008.558020 | 1.000173 |
u_star__1 | 0.447711 | 0.275453 | 4.260404e-03 | -0.100902 | 0.936185 | 5088.343252 | 1.000099 |
mean | -0.001239 | 0.008905 | 9.158234e-05 | -0.019270 | 0.015879 | 9237.028735 | 0.999898 |
After sampling, we can make the usual plots. First, let’s look at the folded light curve plot:
# Compute the GP prediction
gp_mod = np.median(trace["gp_pred"] + trace["mean"][:, None], axis=0)
# Get the posterior median orbital parameters
p = np.median(trace["period"])
t0 = np.median(trace["t0"])
# Plot the folded data
x_fold = (x[mask] - t0 + 0.5 * p) % p - 0.5 * p
plt.plot(x_fold, y[mask] - gp_mod, ".k", label="data", zorder=-1000)
# Overplot the phase binned light curve
bins = np.linspace(-0.41, 0.41, 50)
denom, _ = np.histogram(x_fold, bins)
num, _ = np.histogram(x_fold, bins, weights=y[mask])
denom[num == 0] = 1.0
plt.plot(0.5 * (bins[1:] + bins[:-1]), num / denom, "o", color="C2", label="binned")
# Plot the folded model
inds = np.argsort(x_fold)
inds = inds[np.abs(x_fold)[inds] < 0.3]
pred = trace["light_curves"][:, inds, 0]
pred = np.percentile(pred, [16, 50, 84], axis=0)
plt.plot(x_fold[inds], pred[1], color="C1", label="model")
art = plt.fill_between(
x_fold[inds], pred[0], pred[2], color="C1", alpha=0.5, zorder=1000
)
art.set_edgecolor("none")
# Annotate the plot with the planet's period
txt = "period = {0:.5f} +/- {1:.5f} d".format(
np.mean(trace["period"]), np.std(trace["period"])
)
plt.annotate(
txt,
(0, 0),
xycoords="axes fraction",
xytext=(5, 5),
textcoords="offset points",
ha="left",
va="bottom",
fontsize=12,
)
plt.legend(fontsize=10, loc=4)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("time since transit [days]")
plt.ylabel("de-trended flux")
plt.xlim(-0.15, 0.15);
And a corner plot of some of the key parameters:
import corner
import astropy.units as u
varnames = ["period", "b", "ecc", "r_pl"]
samples = pm.trace_to_dataframe(trace, varnames=varnames)
# Convert the radius to Earth radii
samples["r_pl"] = (np.array(samples["r_pl"]) * u.R_sun).to(u.R_earth).value
corner.corner(
samples[["period", "r_pl", "b", "ecc"]],
labels=["period [days]", "radius [Earth radii]", "impact param", "eccentricity"],
);
These all seem consistent with the previously published values, and an earlier inconsistency between this radius measurement and the literature has since been resolved by fixing a bug in exoplanet.
As described in the Citing exoplanet & its dependencies tutorial, we can use exoplanet.citations.get_citations_for_model() to construct an acknowledgement and BibTeX listing that includes the relevant citations for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:agol19, exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:foremanmackey17, exoplanet:foremanmackey18, exoplanet:kipping13, exoplanet:kipping13b, exoplanet:luger18, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet: exoplanet v0.2.1},
month = sep,
year = 2019,
doi = {10.5281/zenodo.3462740},
url = {https://doi.org/10.5281/zenodo.3462740}
}
...
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
Fitting for or marginalizing over the transit times or transit timing variations (TTVs) can be useful for several reasons, and it is a compelling use case for exoplanet because the number of parameters in the model increases significantly: there is a new parameter for each transit. The performance of the NUTS sampler used by exoplanet scales well with the number of parameters, so a TTV model should be substantially faster to run to convergence with exoplanet than with other tools. There are a few definitions and subtleties that should be considered before jumping in.
In this tutorial, we will be using a “descriptive” model, orbits.TTVOrbit, to fit the light curve. The underlying motion is still Keplerian, but the time coordinate is warped to make t0 a function of time. All of the orbital elements besides t0 are shared across all orbits, but the t0 for each transit will be a parameter. This means that other variations (like transit duration variations) are not currently supported, but it would be possible to include more general effects. exoplanet also supports photodynamics modeling using orbits.ReboundOrbit for more detailed analysis, but that is a topic for a future tutorial.
It is also important to note that “transit time” within exoplanet (and most other transit fitting software) is defined as the time of conjunction (called t0 in the code): the time when the true anomaly is \(\pi/2 - \omega\). Section 18 of the EXOFASTv2 paper includes an excellent discussion of some of the commonly used definitions of “transit time” in the literature.
Finally, there is a subtlety in the definition of the “period” of an orbit with TTVs. Two possible definitions are: (1) the average time between transits, or (2) the slope of a least squares fit to the transit times as a function of transit number. In exoplanet, we use the latter definition and call this parameter ttv_period to distinguish it from the period of the underlying Keplerian motion, which sets the shape and duration of the transit. By default, these two periods are constrained to be equal, but it can be useful to fit for both parameters since the shape of the transit might not be perfectly described by the same period. That being said, if you fit for both periods, make sure that you constrain ttv_period and period to be similar or things can get a bit ugly.
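To make the second definition concrete, here’s a minimal sketch in plain NumPy (with hypothetical transit times; this is not part of exoplanet’s API) of how a least squares period and reference time can be computed from a set of transit times:
import numpy as np
tts_example = np.array([1.0, 11.05, 20.98, 31.02, 40.97])  # hypothetical transit times [days]
n_example = np.arange(len(tts_example))  # transit numbers
# The slope of this linear fit plays the role of ttv_period; the intercept that of t0
slope, intercept = np.polyfit(n_example, tts_example, 1)
print(slope, intercept)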
To get started, let’s generate some simulated transit times. We’ll use the orbits.ttv.compute_expected_transit_times() function to get the expected transit times for a linear ephemeris within some observation baseline:
import numpy as np
import matplotlib.pyplot as plt
import exoplanet as xo
np.random.seed(3948)
true_periods = np.random.uniform(8, 12, 2)
true_t0s = true_periods * np.random.rand(2)
t = np.arange(0, 80, 0.01)
texp = 0.01
yerr = 5e-4
# Compute the transit times for a linear ephemeris
true_transit_times = xo.orbits.ttv.compute_expected_transit_times(
t.min(), t.max(), true_periods, true_t0s
)
# Simulate transit timing variations using a simple model
true_ttvs = [
(0.05 - (i % 2) * 0.1) * np.sin(2 * np.pi * tt / 23.7)
for i, tt in enumerate(true_transit_times)
]
true_transit_times = [tt + v for tt, v in zip(true_transit_times, true_ttvs)]
# Plot the true TTV model
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
ax1.plot(true_transit_times[0], true_ttvs[0], ".k")
ax1.axhline(0, color="k", lw=0.5)
ax1.set_ylim(np.max(np.abs(ax1.get_ylim())) * np.array([-1, 1]))
ax1.set_ylabel("$O-C$ [days]")
ax2.plot(true_transit_times[1], true_ttvs[1], ".k")
ax2.axhline(0, color="k", lw=0.5)
ax2.set_ylim(np.max(np.abs(ax2.get_ylim())) * np.array([-1, 1]))
ax2.set_ylabel("$O-C$ [days]")
ax2.set_xlabel("transit time [days]")
ax1.set_title("true TTVs");
Now, like in the Transit fitting tutorial, we’ll set up the model using PyMC3 and exoplanet, and then simulate a data set from that model.
import pymc3 as pm
import theano.tensor as tt
np.random.seed(9485023)
with pm.Model() as model:
# This part of the model is similar to the model in the `transit` tutorial
mean = pm.Normal("mean", mu=0.0, sd=1.0)
u = xo.distributions.QuadLimbDark("u", testval=np.array([0.3, 0.2]))
logr = pm.Uniform(
"logr",
lower=np.log(0.01),
upper=np.log(0.1),
shape=2,
testval=np.log([0.04, 0.06]),
)
r = pm.Deterministic("r", tt.exp(logr))
b = xo.distributions.ImpactParameter(
"b", ror=r, shape=2, testval=0.5 * np.random.rand(2)
)
# Now we have a parameter for each transit time for each planet:
transit_times = []
for i in range(2):
transit_times.append(
pm.Normal(
"tts_{0}".format(i),
mu=true_transit_times[i],
sd=1.0,
shape=len(true_transit_times[i]),
)
)
# Set up an orbit for the planets
orbit = xo.orbits.TTVOrbit(b=b, transit_times=transit_times)
# It will be useful later to track some parameters of the orbit
pm.Deterministic("t0", orbit.t0)
pm.Deterministic("period", orbit.period)
for i in range(2):
pm.Deterministic("ttvs_{0}".format(i), orbit.ttvs[i])
# The rest of this block follows the transit fitting tutorial
light_curves = xo.LimbDarkLightCurve(u).get_light_curve(
orbit=orbit, r=r, t=t, texp=texp
)
light_curve = pm.math.sum(light_curves, axis=-1) + mean
pm.Deterministic("light_curves", light_curves)
y = xo.eval_in_model(light_curve)
y += yerr * np.random.randn(len(y))
pm.Normal("obs", mu=light_curve, sd=yerr, observed=y)
map_soln = model.test_point
map_soln = xo.optimize(start=map_soln, vars=transit_times)
map_soln = xo.optimize(start=map_soln, vars=[r, b])
map_soln = xo.optimize(start=map_soln, vars=transit_times)
map_soln = xo.optimize(start=map_soln)
optimizing logp for variables: [tts_1, tts_0]
120it [00:04, 26.01it/s, logp=4.946128e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 49454.87884962603 -> 49461.28172640355
optimizing logp for variables: [b, logr]
18it [00:00, 23.53it/s, logp=4.946356e+04]
message: Optimization terminated successfully.
logp: 49461.28172640355 -> 49463.56218609357
optimizing logp for variables: [tts_1, tts_0]
53it [00:01, 48.16it/s, logp=4.946363e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 49463.56218609357 -> 49463.626773189084
optimizing logp for variables: [tts_1, tts_0, b, logr, u, mean]
113it [00:01, 88.99it/s, logp=4.946400e+04]
message: Desired error not necessarily achieved due to precision loss.
logp: 49463.626773189084 -> 49464.004718781114
Here’s our simulated light curve and the initial model:
plt.plot(t, y, ".k", ms=4, label="data")
for i, l in enumerate("bc"):
plt.plot(t, map_soln["light_curves"][:, i], lw=1, label="planet {0}".format(l))
plt.xlim(t.min(), t.max())
plt.ylabel("relative flux")
plt.xlabel("time [days]")
plt.legend(fontsize=10)
plt.title("map model");
This looks similar to the light curve from the Transit fitting tutorial, but if we try plotting the folded transits, we can see that something isn’t right: these transits look pretty smeared out!
for n, letter in enumerate("bc"):
plt.figure()
# Get the posterior median orbital parameters
p = map_soln["period"][n]
t0 = map_soln["t0"][n]
# Compute the median of posterior estimate of the contribution from
# the other planet. Then we can remove this from the data to plot
# just the planet we care about.
other = map_soln["light_curves"][:, (n + 1) % 2]
# Plot the folded data
x_fold = (t - t0 + 0.5 * p) % p - 0.5 * p
plt.errorbar(x_fold, y - other, yerr=yerr, fmt=".k", label="data", zorder=-1000)
plt.legend(fontsize=10, loc=4)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("time since transit [days]")
plt.ylabel("relative flux")
plt.title("planet {0}".format(letter))
plt.xlim(-0.3, 0.3)
Instead, we can correct for the TTVs by folding on the model’s warped time coordinate (which removes the best fit transit times) and plot that instead:
with model:
t_warp = xo.eval_in_model(orbit._warp_times(t), map_soln)
for n, letter in enumerate("bc"):
plt.figure()
p = map_soln["period"][n]
other = map_soln["light_curves"][:, (n + 1) % 2]
# NOTE: 't0' has already been subtracted!
x_fold = (t_warp[:, n] + 0.5 * p) % p - 0.5 * p
plt.errorbar(x_fold, y - other, yerr=yerr, fmt=".k", label="data", zorder=-1000)
plt.legend(fontsize=10, loc=4)
plt.xlim(-0.5 * p, 0.5 * p)
plt.xlabel("time since transit [days]")
plt.ylabel("relative flux")
plt.title("planet {0}".format(letter))
plt.xlim(-0.3, 0.3)
That looks better!
Now let’s run some MCMC as usual:
np.random.seed(230948)
with model:
trace = pm.sample(
tune=1000,
draws=1000,
start=map_soln,
step=xo.get_dense_nuts_step(target_accept=0.9),
)
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [tts_1, tts_0, b, logr, u, mean]
Sampling 4 chains: 100%|██████████| 8000/8000 [01:52<00:00, 71.38draws/s]
Then check the convergence diagnostics:
pm.summary(trace, varnames=["mean", "u", "logr", "b", "tts_0", "tts_1"])
parameter | mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat |
---|---|---|---|---|---|---|---|
mean | -0.000003 | 0.000006 | 6.036550e-08 | -0.000014 | 0.000008 | 7951.264445 | 0.999598 |
u__0 | 0.342717 | 0.171603 | 3.127874e-03 | 0.012857 | 0.625676 | 3307.235257 | 0.999786 |
u__1 | 0.178007 | 0.297945 | 5.648345e-03 | -0.324534 | 0.719815 | 2888.604157 | 0.999622 |
logr__0 | -3.222882 | 0.019438 | 3.078609e-04 | -3.261813 | -3.186574 | 3630.177143 | 0.999938 |
logr__1 | -2.813157 | 0.012714 | 2.314427e-04 | -2.839966 | -2.790918 | 2983.225881 | 0.999739 |
b__0 | 0.394382 | 0.043633 | 8.273671e-04 | 0.305350 | 0.473317 | 2841.660084 | 1.000083 |
b__1 | 0.354436 | 0.030505 | 5.763580e-04 | 0.293155 | 0.408723 | 2606.185780 | 0.999798 |
tts_0__0 | 6.962988 | 0.004645 | 7.883311e-05 | 6.954894 | 6.972102 | 3957.707556 | 1.000532 |
tts_0__1 | 15.105173 | 0.006857 | 1.375224e-04 | 15.091271 | 15.116490 | 2590.207778 | 1.000638 |
tts_0__2 | 23.380438 | 0.002205 | 2.809419e-05 | 23.376013 | 23.384614 | 5843.603039 | 0.999822 |
tts_0__3 | 31.666856 | 0.003857 | 4.944050e-05 | 31.659030 | 31.673703 | 6286.150816 | 0.999847 |
tts_0__4 | 39.813239 | 0.002718 | 3.407893e-05 | 39.808318 | 39.818873 | 6735.112313 | 0.999776 |
tts_0__5 | 48.105357 | 0.003237 | 4.216364e-05 | 48.099165 | 48.111476 | 6479.282957 | 0.999918 |
tts_0__6 | 56.371431 | 0.003099 | 4.131000e-05 | 56.365803 | 56.378123 | 5316.468567 | 0.999730 |
tts_0__7 | 64.525313 | 0.001875 | 2.405386e-05 | 64.521625 | 64.528931 | 7065.382141 | 1.000028 |
tts_0__8 | 72.834911 | 0.002607 | 3.131700e-05 | 72.829892 | 72.840116 | 7760.060791 | 0.999530 |
tts_1__0 | 1.970518 | 0.001536 | 1.936021e-05 | 1.967514 | 1.973597 | 6230.401871 | 0.999885 |
tts_1__1 | 13.395061 | 0.001442 | 1.642121e-05 | 13.392272 | 13.397960 | 7184.991129 | 0.999723 |
tts_1__2 | 24.743927 | 0.001394 | 1.624029e-05 | 24.741089 | 24.746494 | 7539.455329 | 0.999579 |
tts_1__3 | 36.143523 | 0.001219 | 1.501228e-05 | 36.141029 | 36.145809 | 7086.565692 | 0.999945 |
tts_1__4 | 47.512046 | 0.001427 | 1.718606e-05 | 47.509357 | 47.514923 | 8361.206162 | 1.000228 |
tts_1__5 | 58.888722 | 0.001301 | 1.681865e-05 | 58.886356 | 58.891440 | 6284.273858 | 1.000194 |
tts_1__6 | 70.284253 | 0.001227 | 1.386730e-05 | 70.281896 | 70.286663 | 6423.395074 | 0.999581 |
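As a quick sanity check before plotting, we can compare the posterior mean transit times to the simulated truth. Here’s a minimal sketch (not part of the original notebook) using the true_transit_times list from above:
# Compare posterior mean transit times to the simulated truth (illustrative check)
for i in range(2):
    resid = np.mean(trace["tts_{0}".format(i)], axis=0) - true_transit_times[i]
    print("planet {0}: max |residual| = {1:.1e} days".format(i, np.max(np.abs(resid))))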
And plot the corner plot of the physical parameters:
import corner
with model:
truths = np.concatenate(
list(map(np.atleast_1d, xo.eval_in_model([orbit.period, r, b])))
)
samples = pm.trace_to_dataframe(trace, varnames=["period", "r", "b"])
corner.corner(
samples,
truths=truths,
labels=["period 1", "period 2", "radius 1", "radius 2", "impact 1", "impact 2"],
);
We could also plot corner plots of the transit times, but they’re not terribly enlightening in this case, so let’s skip it.
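If you did want one, a minimal sketch following the same pattern as the other corner plots in this tutorial would be:
# Corner plot of the transit times for the first planet (sketch, skipped in the text)
samples_tts = pm.trace_to_dataframe(trace, varnames=["tts_0"])
corner.corner(samples_tts);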
Finally, let’s plot the posterior estimates of the transit times in an O-C diagram:
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
q = np.percentile(trace["ttvs_0"], [16, 50, 84], axis=0)
ax1.fill_between(
np.mean(trace["tts_0"], axis=0), q[0], q[2], color="C0", alpha=0.4, edgecolor="none"
)
ref = np.polyval(
np.polyfit(true_transit_times[0], true_ttvs[0], 1), true_transit_times[0]
)
ax1.plot(true_transit_times[0], true_ttvs[0] - ref, ".k")
ax1.axhline(0, color="k", lw=0.5)
ax1.set_ylim(np.max(np.abs(ax1.get_ylim())) * np.array([-1, 1]))
ax1.set_ylabel("$O-C$ [days]")
q = np.percentile(trace["ttvs_1"], [16, 50, 84], axis=0)
ax2.fill_between(
np.mean(trace["tts_1"], axis=0), q[0], q[2], color="C1", alpha=0.4, edgecolor="none"
)
ref = np.polyval(
np.polyfit(true_transit_times[1], true_ttvs[1], 1), true_transit_times[1]
)
ax2.plot(true_transit_times[1], true_ttvs[1] - ref, ".k", label="truth")
ax2.axhline(0, color="k", lw=0.5)
ax2.set_ylim(np.max(np.abs(ax2.get_ylim())) * np.array([-1, 1]))
ax2.legend(fontsize=10)
ax2.set_ylabel("$O-C$ [days]")
ax2.set_xlabel("transit time [days]")
ax1.set_title("posterior inference");
As described in the Citing exoplanet & its dependencies tutorial, we can use exoplanet.citations.get_citations_for_model() to construct an acknowledgement and BibTeX listing that includes the relevant citations for this model.
with model:
txt, bib = xo.citations.get_citations_for_model()
print(txt)
This research made use of \textsf{exoplanet} \citep{exoplanet} and its dependencies \citep{exoplanet:agol19, exoplanet:astropy13, exoplanet:astropy18, exoplanet:exoplanet, exoplanet:kipping13, exoplanet:luger18, exoplanet:pymc3, exoplanet:theano}.
print("\n".join(bib.splitlines()[:10]) + "\n...")
@misc{exoplanet:exoplanet,
author = {Daniel Foreman-Mackey and Ian Czekala and Rodrigo Luger and
Eric Agol and Geert Barentsen and Tom Barclay},
title = {dfm/exoplanet},
month = sep,
year = 2019,
doi = {10.5281/zenodo.1998447},
url = {https://doi.org/10.5281/zenodo.1998447}
}
...
Copyright 2018, 2019 Daniel Foreman-Mackey.
The source code is made available under the terms of the MIT license.
If you make use of this code, please cite this package and its dependencies. You can find more information about how and what to cite in the Citing exoplanet & its dependencies documentation.
These docs were made using Sphinx and the Typlog theme. They are built and hosted on Read the Docs.