fatiando.inversion

Everything you need to solve inverse problems!

This package provides the basic building blocks to implement inverse problem
solvers. The main class for this is Misfit.
It represents a data-misfit function and contains various tools to fit a
model to some data. All you have to do is implement methods to calculate the
predicted (or modeled) data and (optionally) the Jacobian (or sensitivity)
matrix. With only that, you have access to a range of optimization methods,
regularization, joint inversion, etc.
- misfit: Defines the data-misfit classes. Used to implement new inverse
  problems. The main class is Misfit. It offers a template for you to create
  standard least-squares inversion methods.
- regularization: Classes for common regularizing functions and base classes
  for building new ones.
- hyper_param: Classes for hyper parameter optimization (estimating the
  regularization parameter), like L-curve analysis and (in the future)
  cross-validation.
- optimization: Functions for several optimization methods (used internally
  by Misfit). In most cases you won’t need to touch this.
- base: Base classes used internally to define common operations and methods.
  In most cases you won’t need to touch this.

You can import the Misfit, regularization, and hyper parameter optimization
classes directly from the fatiando.inversion namespace:
>>> from fatiando.inversion import Misfit, LCurve, Damping, Smoothness
The Misfit class is used internally in Fatiando to implement all of our
inversion algorithms. Take a look at the examples below to see how it is used.
You use the Misfit class by subclassing it. Doing this gives you access to
all of the optimization functions and the possible combinations of
regularization. When subclassing Misfit, you’ll need to implement the
predicted method, which calculates a predicted data vector from an input
parameter vector. This is virtually all that is problem-specific in an
inverse problem. If you want to use gradient-based optimization (or solve
linear problems), you’ll also need to implement the jacobian method, which
calculates the Jacobian (or sensitivity) matrix.
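As a rough sketch of that template (the MyProblem class below is a
hypothetical placeholder, not part of Fatiando, and it assumes Misfit has
been imported as in the examples below), a subclass only needs to define
these two methods:

>>> class MyProblem(Misfit):  # hypothetical name, just a sketch
...     "Skeleton of a Misfit subclass"
...     def __init__(self, x, data):
...         # Pass the 1D data vector, the number of parameters, and
...         # whether the problem is linear in those parameters
...         super().__init__(data=data, nparams=2, islinear=True)
...         self.x = x  # store whatever the forward model needs
...     def predicted(self, p):
...         "Return the predicted 1D data vector for parameter vector p"
...         raise NotImplementedError
...     def jacobian(self, p):
...         "Return the (ndata, nparams) Jacobian (sensitivity) matrix"
...         raise NotImplementedError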
Here is an example of how to implement a simple linear regression using the
Misfit class.
What we want to do is fit \(f(a, b, x) = y = ax + b\) to find a and b.
Putting a and b in a parameter vector p = [a, b] we get:

\[\mathbf{d} = \mathbf{A}\mathbf{p}\]

where \(\mathbf{d}\) is the data vector containing all the values of y and
\(\mathbf{A}\) is the Jacobian matrix with the values of x in the first column
and 1 in the second column.
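Since the problem is linear, the least-squares estimate corresponds to the
familiar normal-equations solution (shown here only for reference; the
optimization routines take care of this internally):

\[\hat{\mathbf{p}} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{d}\]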
All we have to do to implement a solver for this problem is write the
predicted method (to calculate y from values of a and b) and the jacobian
method (to calculate the Jacobian matrix):
First, I’ll load numpy and the Misfit class.
>>> import numpy as np
>>> from fatiando.inversion import Misfit
We’ll also need some compatibility code to support Python 2 and 3 at the same time.
>>> from __future__ import division
>>> from future.builtins import super
Now, I’ll make my regression class that inherits from Misfit.
We get all of the least-squares solving code, and much more, for free just by
using Misfit as a template for our regression class. Note that Misfit expects
a 1D data vector, no matter what your data represents (line, grid, volume).
>>> class Regression(Misfit):
... "Perform a linear regression"
... def __init__(self, x, y):
... # Call the initialization of Misfit
... super().__init__(data=y, nparams=2, islinear=True)
... self.x = x # Store this to use in the other methods
... def predicted(self, p):
... "Calculate the predicted data for a given parameter vector"
... a, b = p
... return a*self.x + b
... def jacobian(self, p):
... "Calculate the Jacobian (ignores p because the problem is linear)"
... jac = np.empty((self.ndata, self.nparams))
... jac[:, 0] = self.x
... jac[:, 1] = 1
... return jac
To test our regression, let’s generate some data based on known values of a and b.
>>> x = np.linspace(0, 5, 6)
>>> x
array([ 0., 1., 2., 3., 4., 5.])
>>> y = 2*x + 5
>>> y
array([ 5., 7., 9., 11., 13., 15.])
We must create a solver object to perform our regression. Fitting the data
(running the optimization) is done by calling the fit method.
fit returns the solver object itself, so we can chain calls to it like so:
>>> solver = Regression(x, y).fit()
The estimated parameter vector is stored in the p_ attribute.
Misfit also provides an estimate_ attribute that can be a custom
(user-defined) formatted version of p_. It’s better to use estimate_ unless
you need the raw parameter vector itself. Since we didn’t implement this
custom formatting, both give the same value here.
>>> solver.p_
array([ 2., 5.])
>>> solver.estimate_
array([ 2., 5.])
The methods predicted and residuals will use the cached values based on p_
if the parameter vector is omitted as an argument.
>>> solver.predicted()
array([ 5., 7., 9., 11., 13., 15.])
>>> np.all(np.abs(solver.residuals()) < 10**-10)
True
The Misfit class caches some of the matrices that it computes, like the
Jacobian matrix. This is needed because separate methods call jacobian with
the same input p, so recomputing the matrix would be a waste.
For linear problems, the Jacobian matrix is cached permanently. It is only
calculated once, no matter what p is passed to it.
>>> A = solver.jacobian(solver.p_)
>>> A
array([[ 0., 1.],
       [ 1., 1.],
       [ 2., 1.],
       [ 3., 1.],
       [ 4., 1.],
       [ 5., 1.]])
>>> B = solver.jacobian(np.array([20, 30]))
>>> B
array([[ 0., 1.],
       [ 1., 1.],
       [ 2., 1.],
       [ 3., 1.],
       [ 4., 1.],
       [ 5., 1.]])
>>> A is B
True
For non-linear problems, the Jacobian is also cached but it will be
recalculated if passed a different value of p (see the non-linear example
below).
The Hessian matrix (\(2\mathbf{A}^T\mathbf{A}\)) is also cached permanently for linear problems.
>>> H = solver.hessian(solver.p_)
>>> H
array([[ 110., 30.],
       [ 30., 12.]])
>>> H2 = solver.hessian(np.array([20, 30]))
>>> H2
array([[ 110., 30.],
       [ 30., 12.]])
>>> H is H2
True
You can configure the solver using the config method. This allows you to
choose the optimization method you want to use and its corresponding
parameters. The config method also returns the solver object itself so it
can be chained with fit:
>>> # Configure the solver to use the Levenberg-Marquardt method
>>> solver.config('levmarq', initial=[1, 1]).fit().estimate_
array([ 2., 5.])
>>> np.all(np.abs(solver.residuals()) < 10**-10)
True
>>> # or the Steepest Descent method
>>> solver.config('steepest', initial=[1, 1]).fit().estimate_
array([ 2., 5.])
>>> # or the Gauss-Newton method
>>> solver.config('newton', initial=[1, 1], maxit=5).fit().estimate_
array([ 2., 5.])
The Misfit class keeps track of the optimization process and makes that
data available through the stats_ attribute (a dictionary).
>>> list(sorted(solver.stats_))
['iterations', 'method', 'objective']
>>> solver.stats_['method']
"Newton's method"
>>> solver.stats_['iterations']
5
The 'objective' key holds a list of the objective function values per
iteration of the optimization process.
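For example, you could plot these values to inspect the convergence (a
sketch assuming matplotlib is installed; it is not part of the doctests
above):

>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> plt.plot(solver.stats_['objective'])  # doctest: +SKIP
>>> plt.xlabel('Iteration')  # doctest: +SKIP
>>> plt.ylabel('Objective function')  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP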
Misfit allows you to assign weights to the data in the form of a weight
matrix or vector (the vector is assumed to be the diagonal of the weight
matrix). We can use this to perform a re-weighted least-squares fit that
suppresses the influence of outliers in our data.
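In general terms, a weight matrix \(\mathbf{W}\) turns the data-misfit into a
weighted least-squares function of the residuals (a sketch of the standard
formulation; the exact normalization used internally may differ):

\[\phi(\mathbf{p}) = [\mathbf{d} - \mathbf{d}^{pred}(\mathbf{p})]^T \mathbf{W}
[\mathbf{d} - \mathbf{d}^{pred}(\mathbf{p})]\]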
>>> y_outlier = y.copy()
>>> y_outlier[3] += 20
>>> y_outlier
array([ 5., 7., 9., 31., 13., 15.])
First, we run the regression without any weights.
>>> solver = Regression(x, y_outlier).fit()
>>> print(np.array_repr(solver.estimate_, precision=3))
array([ 2.571, 6.905])
Now, we can use the inverse of the residuals to set the weights for our data. We repeat this for a few iterations and should have our robust estimate by the end of it.
>>> for i in range(20):
...     r = np.abs(solver.residuals())
...     # Replace very small residuals to avoid zero-division errors
...     r[r < 1e-10] = 1
...     _ = solver.set_weights(1/r).fit()
>>> solver.estimate_
array([ 2., 5.])
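The weights used in each pass of this loop are simply the inverse absolute
residuals, a standard iteratively re-weighted least-squares scheme that
approximates a robust (L1-like) fit and down-weights the outlier:

\[w_i = \dfrac{1}{|r_i|}, \qquad r_i = d_i - d_i^{pred}(\mathbf{p})\]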
In this example, I want to fit a Gaussian to my data:

\[f(x) = a\, e^{-b(x + c)^2}\]

The function f is non-linear with respect to the inversion parameters a, b,
and c. Thus, we need to configure the solver and choose an optimization
method before we can call fit().
First, let’s create our solver class based on Misfit and implement the
predicted and jacobian methods.
>>> class GaussianFit(Misfit):
...     def __init__(self, x, y):
...         super().__init__(
...             data=y, nparams=3, islinear=False)
...         self.x = x
...     def predicted(self, p):
...         a, b, c = p
...         return a*np.exp(-b*(self.x + c)**2)
...     def jacobian(self, p):
...         a, b, c = p
...         jac = np.empty((self.ndata, self.nparams))
...         var = self.x + c
...         exponential = np.exp(-b*var**2)
...         jac[:, 0] = exponential
...         jac[:, 1] = -a*exponential*(var**2)
...         jac[:, 2] = -a*exponential*2*b*var
...         return jac
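For reference, the three Jacobian columns filled in above are the partial
derivatives of \(f(x) = a\, e^{-b(x + c)^2}\) with respect to each parameter:

\[\dfrac{\partial f}{\partial a} = e^{-b(x + c)^2}, \qquad
\dfrac{\partial f}{\partial b} = -a\,(x + c)^2\, e^{-b(x + c)^2}, \qquad
\dfrac{\partial f}{\partial c} = -2\,a\,b\,(x + c)\, e^{-b(x + c)^2}\]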
Let’s create some data to test this.
>>> x = np.linspace(0, 10, 1000)
>>> a, b, c = 100, 0.1, -2
>>> y = a*np.exp(-b*(x + c)**2)
For non-linear problems, we have to configure the optimization method. Let’s use Levenberg-Marquardt because it generally offers good convergence.
>>> solver = GaussianFit(x, y).config('levmarq', initial=[1, 1, 1]).fit()
>>> # Print the estimated coefficients
>>> print(', '.join(['{:.1f}'.format(i) for i in solver.estimate_]))
100.0, 0.1, -2.0
>>> np.all(np.abs(solver.residuals()) < 10**-10)
True
We can use other optimization methods without having to re-implement our solution. For example, let’s see how well the Ant Colony Optimization for Continuous Domains (ACO-R) does for this problem:
>>> # bounds are the min, max values of the search domain for each parameter
>>> _ = solver.config('acor', bounds=[50, 500, 0, 1, -20, 0], seed=0).fit()
>>> print(', '.join(['{:.1f}'.format(i) for i in solver.estimate_]))
100.0, 0.1, -2.0
For non-linear problems, the Jacobian and Hessian are cached but not
permanently. Calling jacobian twice in a row with the same parameter vector
will not trigger a computation and will return the cached value instead.
>>> A = solver.jacobian(np.array([1, 1, 1]))
>>> B = solver.jacobian(np.array([1, 1, 1]))
>>> A is B
True
But passing a different p will trigger a computation, and the cache will be
replaced by the new value.
>>> C = solver.jacobian(np.array([1, 1, 1.1]))
>>> A is C
False
>>> np.all(A == C)
False
>>> D = solver.jacobian(np.array([1, 1, 1.1]))
>>> D is C
True