Hyper parameter optimization (`fatiando.inversion.hyper_param`)¶

Classes for hyper parameter estimation (like the regularizing parameter).

These classes copy the interface of the standard inversion classes based on Misfit (i.e., solver.config(...).fit().estimate_). When their fit method is called, they perform many runs of the inversion and try to select the optimal values for the hyper parameters. The class will then behave as the solver that yields the best estimate (e.g., solver[0].predicted()).

Available classes:

LCurve: Estimate the regularizing parameter using an L-curve analysis.

class fatiando.inversion.hyper_param.LCurve(datamisfit, regul, regul_params, loglog=True, njobs=1)[source]¶

Bases: fatiando.inversion.base.OptimizerMixin

Use the L-curve criterion to estimate the regularization parameter.

Runs the inversion using several specified regularization parameters. The best value is the one that falls on the corner of the log-log plot of the data misfit vs regularizing function. This point is automatically found using the triangle method of Castellanos et al. (2002).

This class behaves as Misfit. To use it, simply call fit and optionally config. The estimate will be stored in estimate_ and p_. The estimated regularization parameter will be stored in regul_param_.

Parameters:

datamisfit
: Misfit
The data misfit instance for the inverse problem. Can be a sum of other misfits.
regul
: A class from fatiando.inversion.regularization
The regularizing function.
regul_params
: list
The values of the regularization parameter that will be tested.
loglog
: True or False
If True, will use a log-log scale for the L-curve (recommended).
jobs
: None or int
If not None, will use jobs processes to calculate the L-curve.

References:

Castellanos, J. L., S. Gomez, and V. Guerra (2002), The triangle method for finding the corner of the L-curve, Applied Numerical Mathematics, 43(4), 359-373, doi:10.1016/S0168-9274(01)00179-9.

Examples:

We’ll use the L-curve to estimate the best regularization parameter for a smooth inversion using fatiando.seismic.srtomo.

First, we’ll setup some synthetic data:

>>> import numpy
>>> from fatiando.mesher import SquareMesh
>>> from fatiando.seismic import ttime2d, srtomo
>>> from fatiando.inversion import Smoothness2D, LCurve
>>> from fatiando import utils, gridder
>>> area = (0, 2, 0, 2)
>>> shape = (10, 10)
>>> model = SquareMesh(area, shape)
>>> vp = 4*numpy.ones(shape)
>>> vp[3:7,3:7] = 10
>>> vp
array([[  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])
>>> model.addprop('vp', vp.ravel())
>>> src_loc_x, src_loc_y = gridder.scatter(area, 30, seed=0)
>>> src_loc = numpy.transpose([src_loc_x, src_loc_y])
>>> rec_loc_x, rec_loc_y = gridder.circular_scatter(area, 20,
...                                                 random=True, seed=0)
>>> rec_loc = numpy.transpose([rec_loc_x, rec_loc_y])
>>> srcs = [src for src in src_loc for _ in rec_loc]
>>> recs = [rec for _ in src_loc for rec in rec_loc]
>>> tts = ttime2d.straight(model, 'vp', srcs, recs)
>>> tts = utils.contaminate(tts, 0.01, percent=True, seed=0)

Now we can setup a tomography by creating the necessary data misfit (SRTomo) and regularization (Smoothness2D) objects. We’ll normalize the data misfit by the number of data points to make the scale of the regularization parameter more tractable.

>>> mesh = SquareMesh(area, shape)
>>> datamisfit = (1./tts.size)*srtomo.SRTomo(tts, srcs, recs, mesh)
>>> regul = Smoothness2D(mesh.shape)

The tomography solver will be the LCurve solver. It works by calling fit() and accessing estimate_, exactly like any other solver:

>>> regul_params = [10**i for i in range(-10, -2, 1)]
>>> tomo = LCurve(datamisfit, regul, regul_params)
>>> _ = tomo.fit()
>>> print(numpy.array_repr(tomo.estimate_.reshape(shape), precision=0))
array([[  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  11.,   9.,  11.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  11.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  11.,  10.,  11.,   9.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])

When fit() is called, the LCurve will run the inversion for each value of the regularization parameter, build an l-curve, and find the best solution (i.e., the corner value of the l-curve).

The LCurve object behaves like a normal multi-objective function. In fact, it will try to mirror the objective function that resulted in the best solution. You can index it to access the data misfit and regularization parts. For example, to get the residuals vector or the predicted data:

>>> predicted = tomo[0].predicted()
>>> residuals = tomo[0].residuals()
>>> print '%.4f %.4f' % (residuals.mean(), residuals.std())
-0.0000 0.0047

The estimated regularization parameter is stored in regul_param_:

>>> tomo.regul_param_
1e-05

You can run the l-curve analysis in parallel by specifying the njobs argument. This will spread the computations over njobs number of processes and give some speedup over running sequentially. Note that you should not enable any kind of multi-processes parallelism on the data misfit class. It is often better to run each inversion sequentially and run many of them in parallel. Note that you’ll enough memory to run multiple inversions at the same time, so this is not suited for large, memory hungry inversions.

>>> par_tomo = LCurve(datamisfit, regul, regul_params, njobs=2)
>>> _ = par_tomo.fit()  # Will you 2 processes to run inversions
>>> par_tomo.regul_param_
1e-05
>>> print(numpy.array_repr(par_tomo.estimate_.reshape(shape), precision=0))
array([[  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  11.,   9.,  11.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  11.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  11.,  10.,  11.,   9.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])

LCurve also has a config method to configure the optimization process for non-linear problems, for example:

>>> initial = numpy.ones(mesh.size)
>>> _ = tomo.config('newton', initial=initial, tol=0.2).fit()
>>> tomo.regul_param_
1e-05
>>> print(numpy.array_repr(tomo.estimate_.reshape(shape), precision=0))
array([[  4.,   4.,   3.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  12.,   9.,  11.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  11.,  11.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  10.,  10.,  10.,  10.,   4.,   4.,   4.],
       [  4.,   4.,   4.,  11.,  10.,  11.,   9.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   5.,   4.,   4.,   4.],
       [  4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.,   4.]])

You can view the optimization information for the run corresponding to the best estimate using the stats_ attribute:

>>> list(sorted(tomo.stats_))
['iterations', 'method', 'objective']
>>> tomo.stats_['method']
"Newton's method"
>>> tomo.stats_['iterations']
2

config(method, **kwargs)¶

Configure the optimization method and its parameters.

This sets the method used by fit and the keyword arguments that are passed to it.

Parameters:

method
: string
The optimization method. One of: 'linear', 'newton', 'levmarq', 'steepest', 'acor'

Other keyword arguments that can be passed are the ones allowed by each method.

Some methods have required arguments:

newton, levmarq and steepest require the initial argument (an initial estimate for the gradient descent)
acor requires the bounds argument (min/max values for the search space)

See the corresponding docstrings for more information:

estimate_¶

A nicely formatted version of the estimate.

If the class implements a fmt_estimate method, this will its results. This can be used to convert the parameter vector to a more useful form, like a fatiando.mesher object.

fit()[source]¶

Solve for the parameter vector and optimum regularization parameter.

Combines the data-misfit and regularization solvers using the range of regularization parameters provided and calls fit and config on each.

The p_ and estimate_ attributes correspond to the combination that falls in the corner of the L-curve.

The regularization parameter for this corner point if stored in the regul_param_ attribute.

Returns:

self

fmt_estimate(p)[source]¶: Return the estimate_ attribute of the optimal solution.

objective_¶: The objective function corresponding to the best estimate.

p_¶: The estimated parameter vector obtained from the best regularization parameter.

plot_lcurve(ax=None, guides=True)[source]¶

Make a plot of the data-misfit x regularization values.

The estimated corner value is shown as a blue triangle.

Parameters:

ax
: matplotlib Axes
If not None, will plot the curve on this Axes instance.
guides
: True or False
Plot vertical and horizontal lines across the corner value.

regul_param_¶: The regularization parameter corresponding to the best estimate.

select_corner()[source]¶

Select the corner value of the L-curve formed inversion results.

This is performed automatically after calling the fit method. You can run this method separately after fit has been called to tweak the results.

You can access the estimated values by:

The p_ and estimate_ attributes will hold the estimated parameter vector and formatted estimate, respective, corresponding to the corner value.
The regul_param_ attribute holds the value of the regularization parameter corresponding to the corner value.
The corner_ attribute will hold the index of the corner value in the list of computed solutions.

Uses the Triangle method of Castellanos et al. (2002).

References:

Castellanos, J. L., S. Gomez, and V. Guerra (2002), The triangle method for finding the corner of the L-curve, Applied Numerical Mathematics, 43(4), 359-373, doi:10.1016/S0168-9274(01)00179-9.

stats_¶: The optimization information for the best solution found.

Hyper parameter optimization (fatiando.inversion.hyper_param)¶

Hyper parameter optimization (`fatiando.inversion.hyper_param`)¶