3.4 Genetic Algorithm: genal.shell2D.sh

This program will perform a search to find the members of the feasible set of solutions. In addition to name.hkl and name.ins it requires one other file called control. It is called by:

genal.shell2D.sh name

It will output three files:

calculated.log: A record of all the values calculated, their FOMs and the initial values of the phases.

calc2.log: Similar to calculated.log, except with the final values of the phases.

solutions: The best initial values (encoded as bits) found to date.

Two small programs, sort.sh and xfom, are included to help analyze these files. sort.sh collates the top 300 unique solutions from calc2.log and calculated.log, putting them into files called sorted and sortedc, respectively. xfom examines the solutions file and finds the solutions that differ from one another by some factor (e.g. 0.05 or 0.1) in terms of a robust correlation.

The control file is normally set up by the Divergence analysis div2D, although you can change the parameters yourself if you really want to. The format of the control file is:

Line 1: Number of reflections (N) to permute - format i4

Lines 2 to N+1: Number of bits, low angle, high angle - formatted as i4, f10.6, X3, f10.6

Note: the maximum number of bits per angle is defined in Params.h within the directories Genetic and Includes, and is currently set to 4.
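As a rough illustration of the layout of Line 1 and Lines 2 to N+1, the sketch below reads and writes those lines as fixed-width fields. It assumes that X3 means three skipped columns between the two angles; the function names parse_reflection_block and format_reflection are hypothetical and not part of the package, and the angle values are made up for the example.

```python
MAX_BITS = 4  # current limit quoted in the note above (Params.h)


def format_reflection(nbits, low, high):
    """Write one reflection line as i4, f10.6, three blanks, f10.6."""
    return f"{nbits:4d}{low:10.6f}   {high:10.6f}"


def parse_reflection_block(lines):
    """Parse Line 1 and Lines 2 to N+1 of a control file.

    Returns a list of (nbits, low_angle, high_angle) tuples.
    """
    n = int(lines[0][0:4])                 # i4: number of reflections N
    if len(lines) < n + 1:
        raise ValueError("fewer reflection lines than Line 1 announces")
    reflections = []
    for line in lines[1:n + 1]:
        nbits = int(line[0:4])             # i4: columns 1-4
        low = float(line[4:14])            # f10.6: columns 5-14
        high = float(line[17:27])          # f10.6: columns 18-27 (assumed gap of 3)
        if not 1 <= nbits <= MAX_BITS:
            raise ValueError(f"number of bits {nbits} outside 1..{MAX_BITS}")
        reflections.append((nbits, low, high))
    return reflections


# Made-up values, purely to show the column layout.
example = [
    f"{2:4d}",
    format_reflection(2, 0.0, 180.0),
    format_reflection(4, -90.0, 90.0),
]
print(parse_reflection_block(example))
```

The remaining lines of the file (N+2 onwards) are not handled here, since their exact column layout is not specified in this section.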

Warning: on each of the lines below, the parameter name (nindiv, idum, npop, nchild, mrate, nsurv, tension, seed, nsave) is required

Line N+2: Number of individuals to use, nindiv - ignored in current version of genal.shell2D.sh

Line N+3: Seed (negative integer) for the random number generator, idum - used in the first pass only

Line N+4: Number of populations, npop - total number of generations to run

Line N+5: Number of children, nchild - number of children (currently = nindiv)

Line N+6: Mutation rate, mrate - one in every mrate bits will mutate; normally equal to the number of bits

Line N+7: Survivors, nsurv - Number of best solutions reintroduced every other population

Line N+8: Tension, tension - the larger this is, the more low-FOM solutions are preferred

Line N+9: Seed number, seed - number of good final solutions to reintroduce every other population

Line N+10: Solutions, nsave - total number of solutions to keep.
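To give a feel for what the mrate setting means, here is a minimal sketch of bit mutation. It assumes that "one in every mrate bits" is meant on average, so each bit is flipped independently with probability 1/mrate; the program itself may pick the bits differently. The function name mutate is hypothetical.

```python
import random


def mutate(bits, mrate, rng):
    """Flip each bit with probability 1/mrate (assumed interpretation)."""
    p = 1.0 / mrate
    return [b ^ 1 if rng.random() < p else b for b in bits]


rng = random.Random(12345)        # fixed seed so the run is repeatable
parent = [0, 1, 1, 0, 1, 0, 0, 1]
child = mutate(parent, mrate=len(parent), rng=rng)
flipped = sum(a != b for a, b in zip(parent, child))
print(child, "bits flipped:", flipped)
```

With mrate equal to the number of bits, as the control file normally sets it, this interpretation gives about one flipped bit per solution on average.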

The actual algorithm is rather complicated: it runs three subpopulations in parallel, each with a slightly different method of producing children. The best solutions are cross-fertilized between the subpopulations, and steps are taken to ensure that there is sufficient diversity at all times, avoiding what is called premature convergence. This is done using what is called "sharing", whereby very similar solutions are weighted down when it comes to producing children; at the same time, parents are chosen such that they are relatively similar, which is called "niche specialization".
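The idea of sharing can be sketched as follows. This is the textbook form of fitness sharing, using Hamming distance between bit strings and a triangular sharing function, and is not necessarily the formula used in genal.shell2D.sh. The names hamming, niche_counts and shared_weights, the radius sigma, and the selection weights are all assumptions made for the illustration; how a FOM is turned into a selection weight is left out.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def niche_counts(population, sigma):
    """For each solution, sum 1 - d/sigma over all solutions closer than sigma."""
    counts = []
    for a in population:
        total = 0.0
        for b in population:
            d = hamming(a, b)
            if d < sigma:
                total += 1.0 - d / sigma
        counts.append(total)      # always >= 1, since d(a, a) = 0
    return counts


def shared_weights(weights, population, sigma):
    """Divide each selection weight by its niche count (crowded = weighted down)."""
    return [w / c for w, c in zip(weights, niche_counts(population, sigma))]


population = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],   # duplicate of the first solution
    [1, 1, 1, 1],   # isolated solution
]
print(shared_weights([1.0, 1.0, 1.0], population, sigma=2))
```

In this toy population the two identical solutions each keep half of their weight, while the isolated one keeps all of it, which is the sense in which very similar solutions are weighted down when producing children.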
