The fatiando package has been deprecated. Please check out the new tools in the Fatiando a Terra website: www.fatiando.org

Source code for fatiando.inversion.hyper_param

r"""
Classes for hyper parameter estimation (like the regularizing parameter).

These classes copy the interface of the standard inversion classes based on
:class:`~fatiando.inversion.misfit.Misfit` (i.e.,
``solver.config(...).fit().estimate_``). When their ``fit`` method is called,
they perform many runs of the inversion and try to select the optimal values
for the hyper parameters. The class will then behave as the solver that yields
the best estimate (e.g., ``solver[0].predicted()``).

Available classes:

* :class:`~fatiando.inversion.hyper_param.LCurve`: Estimate the regularizing
  parameter using an L-curve analysis.

----

"""
from __future__ import division
import multiprocessing
import numpy

from ..vis import mpl
from .base import OptimizerMixin

__all__ = ['LCurve']


class LCurve(OptimizerMixin):
    """
    Use the L-curve criterion to estimate the regularization parameter.

    Runs the inversion using several specified regularization parameters. The
    best value is the one that falls on the corner of the log-log plot of the
    data misfit vs the regularizing function. This point is found
    automatically using the triangle method of Castellanos et al. (2002).

    This class behaves as :class:`~fatiando.inversion.base.Misfit`. To use
    it, simply call ``fit`` and optionally ``config``. The estimate will be
    stored in ``estimate_`` and ``p_``. The estimated regularization
    parameter will be stored in ``regul_param_``.

    Parameters:

    * datamisfit : :class:`~fatiando.inversion.base.Misfit`
        The data misfit instance for the inverse problem. Can be a sum of
        other misfits.
    * regul : A class from :mod:`fatiando.inversion.regularization`
        The regularizing function.
    * regul_params : list
        The values of the regularization parameter that will be tested.
    * loglog : True or False
        If True, will use a log-log scale for the L-curve (recommended).
    * njobs : int
        If greater than 1, will use *njobs* processes to calculate the
        L-curve.

    References:

    Castellanos, J. L., S. Gomez, and V. Guerra (2002), The triangle method
    for finding the corner of the L-curve, Applied Numerical Mathematics,
    43(4), 359-373, doi:10.1016/S0168-9274(01)00179-9.

    Examples:

    We'll use the L-curve to estimate the best regularization parameter for a
    smooth inversion using :mod:`fatiando.seismic.srtomo`.
    First, we'll set up some synthetic data:

    >>> import numpy
    >>> from fatiando.mesher import SquareMesh
    >>> from fatiando.seismic import ttime2d, srtomo
    >>> from fatiando.inversion import Smoothness2D, LCurve
    >>> from fatiando import utils, gridder
    >>> area = (0, 2, 0, 2)
    >>> shape = (10, 10)
    >>> model = SquareMesh(area, shape)
    >>> vp = 4*numpy.ones(shape)
    >>> vp[3:7,3:7] = 10
    >>> vp
    array([[  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])
    >>> model.addprop('vp', vp.ravel())
    >>> src_loc_x, src_loc_y = gridder.scatter(area, 30, seed=0)
    >>> src_loc = numpy.transpose([src_loc_x, src_loc_y])
    >>> rec_loc_x, rec_loc_y = gridder.circular_scatter(area, 20,
    ...                                                 random=True, seed=0)
    >>> rec_loc = numpy.transpose([rec_loc_x, rec_loc_y])
    >>> srcs = [src for src in src_loc for _ in rec_loc]
    >>> recs = [rec for _ in src_loc for rec in rec_loc]
    >>> tts = ttime2d.straight(model, 'vp', srcs, recs)
    >>> tts = utils.contaminate(tts, 0.01, percent=True, seed=0)

    Now we can set up a tomography by creating the necessary data misfit
    (``SRTomo``) and regularization (``Smoothness2D``) objects. We'll
    normalize the data misfit by the number of data points to make the scale
    of the regularization parameter more tractable.

    >>> mesh = SquareMesh(area, shape)
    >>> datamisfit = (1./tts.size)*srtomo.SRTomo(tts, srcs, recs, mesh)
    >>> regul = Smoothness2D(mesh.shape)

    The tomography solver will be the ``LCurve`` solver.
    It works by calling
    ``fit()`` and accessing ``estimate_``, exactly like any other solver:

    >>> regul_params = [10**i for i in range(-10, -2, 1)]
    >>> tomo = LCurve(datamisfit, regul, regul_params)
    >>> _ = tomo.fit()
    >>> print(numpy.array_repr(tomo.estimate_.reshape(shape), precision=0))
    array([[  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  11.,   9.,  11.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  11.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  11.,  10.,  11.,   9.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])

    When ``fit()`` is called, the ``LCurve`` will run the inversion for each
    value of the regularization parameter, build an L-curve, and find the
    best solution (i.e., the corner value of the L-curve).

    The ``LCurve`` object behaves like a normal multi-objective function. In
    fact, it will try to mirror the objective function that resulted in the
    best solution. You can index it to access the data misfit and
    regularization parts. For example, to get the residuals vector or the
    predicted data:

    >>> predicted = tomo[0].predicted()
    >>> residuals = tomo[0].residuals()
    >>> print('%.4f %.4f' % (residuals.mean(), residuals.std()))
    -0.0000 0.0047

    The estimated regularization parameter is stored in ``regul_param_``:

    >>> tomo.regul_param_
    1e-05

    You can run the L-curve analysis in parallel by specifying the ``njobs``
    argument. This will spread the computations over ``njobs`` processes and
    give some speedup over running sequentially. Note that you should **not**
    enable any kind of multi-process parallelism on the data misfit class. It
    is often better to run each inversion sequentially and run many of them
    in parallel. Note that you'll need enough memory to run multiple
    inversions at the same time, so this is not suited for large,
    memory-hungry inversions.
    >>> par_tomo = LCurve(datamisfit, regul, regul_params, njobs=2)
    >>> _ = par_tomo.fit()  # Will use 2 processes to run the inversions
    >>> par_tomo.regul_param_
    1e-05
    >>> print(numpy.array_repr(par_tomo.estimate_.reshape(shape), precision=0))
    array([[  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  11.,   9.,  11.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  11.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  11.,  10.,  11.,   9.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])

    ``LCurve`` also has a ``config`` method to configure the optimization
    process for non-linear problems, for example:

    >>> initial = numpy.ones(mesh.size)
    >>> _ = tomo.config('newton', initial=initial, tol=0.2).fit()
    >>> tomo.regul_param_
    1e-05
    >>> print(numpy.array_repr(tomo.estimate_.reshape(shape), precision=0))
    array([[  4.,   4.,   3.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  12.,   9.,  11.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  11.,  11.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
           [  4.,   4.,   4.,  11.,  10.,  11.,   9.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   5.,   4.,   4.,   4.],
           [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])

    You can view the optimization information for the run corresponding to
    the best estimate using the ``stats_`` attribute:

    >>> list(sorted(tomo.stats_))
    ['iterations', 'method', 'objective']
    >>> tomo.stats_['method']
    "Newton's method"
    >>> tomo.stats_['iterations']
    2

    """

    def __init__(self, datamisfit, regul, regul_params, loglog=True,
                 njobs=1):
        assert njobs >= 1, \
            "njobs should be >= 1. {} given.".format(njobs)
        self.regul_params = regul_params
        self.datamisfit = datamisfit
        self.regul = regul
        self.objectives = None
        self.dnorm = None
        self.mnorm = None
        self.fit_method = None
        self.fit_args = None
        self.njobs = njobs
        self.loglog = loglog
        # Estimated parameters from the L-curve
        self.corner_ = None

    def _run_fit_first(self):
        """
        Check if a solution was found by running fit.
        Will raise an ``AssertionError`` if not.
        """
        assert self.corner_ is not None, \
            'No optimal solution found. Run "fit" to run the L-curve analysis.'

    @property
    def regul_param_(self):
        """
        The regularization parameter corresponding to the best estimate.
        """
        self._run_fit_first()
        return self.regul_params[self.corner_]

    @property
    def objective_(self):
        """
        The objective function corresponding to the best estimate.
        """
        self._run_fit_first()
        return self.objectives[self.corner_]

    @property
    def stats_(self):
        """
        The optimization information for the best solution found.
        """
        return self.objective_.stats_

    @property
    def p_(self):
        """
        The estimated parameter vector obtained from the best regularization
        parameter.
        """
        return self.objective_.p_
    def fmt_estimate(self, p):
        """
        Return the ``estimate_`` attribute of the optimal solution.
        """
        return self.objective_.estimate_
    def __getitem__(self, i):
        return self.objective_[i]
    def fit(self):
        """
        Solve for the parameter vector and the optimum regularization
        parameter.

        Combines the data-misfit and regularization solvers using the range
        of regularization parameters provided and calls ``fit`` and
        ``config`` on each.

        The ``p_`` and ``estimate_`` attributes correspond to the combination
        that falls in the corner of the L-curve.

        The regularization parameter for this corner point is stored in the
        ``regul_param_`` attribute.

        Returns:

        * self

        """
        if self.datamisfit.islinear:
            self.datamisfit.jacobian('null')
        solvers = [
            self.datamisfit + mu * self.regul for mu in self.regul_params]
        if self.fit_method is not None:
            for solver in solvers:
                solver.config(self.fit_method, **self.fit_args)
        if self.njobs > 1:
            pool = multiprocessing.Pool(self.njobs)
            results = pool.map(_fit_solver, solvers)
            pool.close()
            pool.join()
        else:
            results = [s.fit() for s in solvers]
        self.objectives = results
        self.dnorm = numpy.array(
            [self.datamisfit.value(s.p_) for s in results])
        self.mnorm = numpy.array([self.regul.value(s.p_) for s in results])
        self.select_corner()
        return self
    def _scale_curve(self):
        """
        Put the data-misfit and regularizing function values in the range
        [-10, 10].
        """
        if self.loglog:
            x, y = numpy.log(self.dnorm), numpy.log(self.mnorm)
        else:
            x, y = self.dnorm, self.mnorm

        def scale(a):
            vmin, vmax = a.min(), a.max()
            l, u = -10, 10
            return (((u - l) / (vmax - vmin)) *
                    (a - (u * vmin - l * vmax) / (u - l)))

        return scale(x), scale(y)
    def select_corner(self):
        """
        Select the corner value of the L-curve formed by the inversion
        results.

        This is performed automatically after calling the
        :meth:`~fatiando.inversion.hyper_param.LCurve.fit` method. You can
        run this method separately after
        :meth:`~fatiando.inversion.hyper_param.LCurve.fit` has been called to
        tweak the results.

        You can access the estimated values by:

        * The ``p_`` and ``estimate_`` attributes will hold the estimated
          parameter vector and formatted estimate, respectively,
          corresponding to the corner value.
        * The ``regul_param_`` attribute holds the value of the
          regularization parameter corresponding to the corner value.
        * The ``corner_`` attribute will hold the index of the corner value
          in the list of computed solutions.

        Uses the Triangle method of Castellanos et al. (2002).

        References:

        Castellanos, J. L., S. Gomez, and V. Guerra (2002), The triangle
        method for finding the corner of the L-curve, Applied Numerical
        Mathematics, 43(4), 359-373, doi:10.1016/S0168-9274(01)00179-9.

        """
        x, y = self._scale_curve()
        n = len(self.regul_params)
        corner = n - 1

        def dist(p1, p2):
            "Return the geometric distance between p1 and p2"
            return numpy.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)

        cte = 7. * numpy.pi / 8.
        angmin = None
        c = [x[-1], y[-1]]
        for k in range(0, n - 2):
            b = [x[k], y[k]]
            for j in range(k + 1, n - 1):
                a = [x[j], y[j]]
                ab = dist(a, b)
                ac = dist(a, c)
                bc = dist(b, c)
                cosa = (ab ** 2 + ac ** 2 - bc ** 2) / (2. * ab * ac)
                ang = numpy.arccos(cosa)
                area = 0.5 * ((b[0] - a[0]) * (a[1] - c[1]) -
                              (a[0] - c[0]) * (b[1] - a[1]))
                # area is > 0 because in the paper C is index 0
                if area > 0 and (ang < cte and
                                 (angmin is None or ang < angmin)):
                    corner = j
                    angmin = ang
        self.corner_ = corner
    def plot_lcurve(self, ax=None, guides=True):
        """
        Make a plot of the data-misfit vs regularization values.

        The estimated corner value is shown as a blue triangle.

        Parameters:

        * ax : matplotlib Axes
            If not ``None``, will plot the curve on this Axes instance.
        * guides : True or False
            Plot vertical and horizontal lines across the corner value.

        """
        if ax is None:
            ax = mpl.gca()
        else:
            mpl.sca(ax)
        x, y = self.dnorm, self.mnorm
        if self.loglog:
            mpl.loglog(x, y, '.-k')
        else:
            mpl.plot(x, y, '.-k')
        if guides:
            vmin, vmax = ax.get_ybound()
            mpl.vlines(x[self.corner_], vmin, vmax)
            vmin, vmax = ax.get_xbound()
            mpl.hlines(y[self.corner_], vmin, vmax)
        mpl.plot(x[self.corner_], y[self.corner_], '^b', markersize=10)
        mpl.xlabel('Data misfit')
        mpl.ylabel('Regularization')
def _fit_solver(solver):
    """
    Call ``fit`` on the solver. Needed for multiprocessing.
    """
    return solver.fit()
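The corner search in ``select_corner`` can be exercised on its own, without running any inversions. Below is a minimal standalone sketch of the same triangle method of Castellanos et al. (2002), applied to a hand-made synthetic L-curve. The ``scale`` and ``select_corner`` names and the synthetic ``dnorm``/``mnorm`` values are local to this example, not part of the fatiando API, and the scaling helper uses a simpler (but equivalent) affine map onto [-10, 10] than the one in ``_scale_curve``.

```python
import numpy as np


def scale(a, lower=-10.0, upper=10.0):
    """Affinely map the values of `a` onto the interval [lower, upper]."""
    vmin, vmax = a.min(), a.max()
    return lower + (upper - lower) * (a - vmin) / (vmax - vmin)


def select_corner(dnorm, mnorm):
    """Pick the corner index of a log-log L-curve with the triangle method."""
    x, y = scale(np.log(dnorm)), scale(np.log(mnorm))
    n = len(x)
    corner = n - 1
    angmin = None
    cte = 7.0 * np.pi / 8.0  # reject angles wider than 157.5 degrees
    c = np.array([x[-1], y[-1]])
    for k in range(n - 2):
        b = np.array([x[k], y[k]])
        for j in range(k + 1, n - 1):
            a = np.array([x[j], y[j]])
            ab = np.linalg.norm(a - b)
            ac = np.linalg.norm(a - c)
            bc = np.linalg.norm(b - c)
            # Angle at vertex a of the triangle (a, b, c), by the law of
            # cosines. The corner is the sharpest admissible such angle.
            ang = np.arccos((ab**2 + ac**2 - bc**2) / (2.0 * ab * ac))
            # Signed area keeps only points on the convex side of the curve.
            area = 0.5 * ((b[0] - a[0]) * (a[1] - c[1]) -
                          (a[0] - c[0]) * (b[1] - a[1]))
            if area > 0 and ang < cte and (angmin is None or ang < angmin):
                corner, angmin = j, ang
    return corner


# A synthetic L-curve: the data misfit is flat then rising, the model norm
# is falling then flat, with the bend placed at index 4.
dnorm = np.array([1.0, 1.01, 1.02, 1.05, 1.1, 3.0, 10.0, 40.0, 150.0])
mnorm = np.array([200.0, 60.0, 18.0, 5.0, 1.5, 1.2, 1.1, 1.05, 1.0])
print(select_corner(dnorm, mnorm))  # index of the bend of the curve
```

The double loop mirrors the structure of ``LCurve.select_corner``: the fixed vertex ``c`` is the last (most under-regularized) point, ``b`` walks the earlier points, and the candidate corner ``a`` is the vertex whose triangle has the sharpest admissible angle.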