Genetic algorithms in joint inversion
I am currently working on developing a joint inversion method for receiver-function and MT-data. For this I am using a genetic algorithm. This page explains the basic workings of the method. For more detailed information look at the presentations on my publications page or visit some of the links.

Genetic algorithms as an optimization method

Genetic algorithms (GA) are a class of optimization algorithms that use techniques similar to biological evolution to solve optimization problems. Similar to Monte Carlo optimization they use a stochastic approach, but the encoding of the optimization parameters in a DNA-like fashion allows them to exchange information between different models in consideration. This usually increases the convergence rate, resulting in less objective function evaluations.

GA flow diagram The usual flow of a genetic algorithm is shown on the left. The individual steps can be performed in a number of ways, depending on the requirements of the user and the optimization problem at hand.

The first and very important step is to chose the way the optimization parameters are encoded. In a lot of cases they are represented by a string of 1 and 0, similar to the usual representation of integers but with a user controllable range and discretization step for each parameter. All operations of the GA are performed on this representation, giving an abstraction from the individual parameters.

After a representation has been chosen, the algorithm randomly generates a number of DNA strings of the required length, the so called population of a chosen size N. This population is the basis for the rest of the algorithm. For each member of the population an objective function value is calculated by transforming their DNA string back to the optimization parameters and evaluating the objective function with those parameters. After all objective function evaluations have been performed, these values are transformed into a probability. Population members with a low objective function value (for a minimization problem) receive high probabilities and vice versa. Based on this probability members are selected for the next iteration.

At the beginning of each iteration a new population of the same size N is created by choosing members from the last generation according to the calculated probability. Duplicates of members are possible. If this was the only step in the iteration there would be no innovation. Therefore two steps are introduced to create new members.

Crossover is a systematic exchange of information. With a chosen probability two models exchange their DNA after a chosen point. This creates new members that have some similarity with the old population members. The idea is that combining characteristics of two good models has a good chance of yielding an even better model.

Mutation on the other hand is completely unsystematic, but can introduce new members into the population. With a chosen probability one of the bits in the parameter representation changes its value. The mutation probability is usually chosen fairly low, so that only about 10 percent of the population undergo mutation.

Joint inversion

Joint inversion or multi-objective optimization means the simultaneous minimization (or maximization) of several objective functions. These type of problems occur in many areas such as economics. engineering and physics. In geophysics joint inversion usually means finding a model that explains several data sets at once.

Joint inversion of EM and seismic data

MT data example For my PhD I develop a method to jointly invert magnetotelluric and receiver function data. On the left you can see an example of high-quality magnetotelluric data from Ireland. The frequency plotted on the x-Axis is a proxy for depth, while the apparent resistivity in the upper panel provides a rough indication of resistivity at those depths. To obtain a better knowledge of the distribution of conductivity with depth we have to solve the inverse problem.

In a similar fashion receiver function data give us information about the shear-wave speed and interfaces in the subsurface. An example of a receiver function is plotted below. Often receiver functions are used as an imaging technique to visualize velocity changes, but we can obtain more detailed information by using an inversion algorithm to construct a model.

Rec func data example

Traditionally electromagnetic data, such as MT, and seismic data, e.g. receiver functions, have been modelled and inverted independently. They are sensitive to different parameters, resistivity and shear-wave velocity, respectively and therefore not necessarily sensitive to the same structures. If on the other hand electrical structure and velocity structure in the subsurface are the same, joint inversion can improve our understanding reducing the number of acceptable models and yielding information about more physical parameters.

In the case of joint inversion of data that is sensitive to different parameters we try to solve two problems: Find a model that explains both datasets and asses whether our assumption of correlated structures makes sense.

Special advantages of GA's in joint inversion

In principle a multi-objective optimization problem can be transformed to a conventional optimization problem by optimizing a weighted sum of the individual objective functions. This is routinely done in regularized inversion where data misfit and model complexity are minimized simultaneously. This example illustrates one of the problems with this approach. The weighting factor is not know a-priori and most researchers perform a number of inversion with different weight to obtain the so called L-curve.

The idea of Pareto optimality

Genetic algorithms can overcome the weighting problem by using the concept of Pareto-optimality and domination to compare the fitness of their population members. One member is said to dominate another member if all objective function values are less or equal and at least one objective function value is less than the other member's (for a minimization problem). A member is called Pareto-optimal if it is not dominated by any other member. Using this criterion there is no longer a single optimum member, but a set of members that are all regarded equally fit and display the trade-off between optimizing the objective function. An example for regularized MT inversion is shown below. For different iterations of the GA we show the non-dominated front, i.e the models the algorithm considers best, and the result is an L shaped curve typical for this kind of problem.

Rec func data example

Some of my recent experiments with synthetic and real data indicate that the shape of the trade-off curve can be used to characterize data compatibility. If all objective functions have the same minimum, the population member representing the parameter values of this minimum dominates all other members. Thus the result of the algorithm should be a single model. If an L-shaped curve, or even an inverted L-shape occurs, it can be taken as an indicator that the minima do not coincide or the objective functions are not compatible. For the geologic interpretation this means that electric and seismic structures differ. Practical aspects make the reliable evaluation of the trade-off difficult and I am currently investigating this area.