nevopy.genetic_algorithm package¶
Submodules¶
nevopy.genetic_algorithm.config module¶
This module implements the FixedTopologyConfig
class, used to
handle the settings of NEvoPy’s fixed-topology neuroevolutionary algorithms.
-
class
nevopy.genetic_algorithm.config.
GeneticAlgorithmConfig
(file_pathname=None, **kwargs)¶ Bases:
object
Stores the settings to be used by
GeneticPopulation
.Individual configurations can be ignored (default values will be used), set in the arguments of this class constructor or written in a file (pathname passed as an argument).
Some parameters/attributes related to mutation chances expects a tuple with two floats, indicating the minimum and the maximum chance of the mutation occurring. A value within the given interval is chosen based on the “mass extinction factor” (mutation chances increases as the number of consecutive generations in which the population has shown no improvement increases). If you want a fixed mutation chance, just place the same value on both positions of the tuple.
Todo
Implementation: loading settings from a config file.
Specify the config file organization in the docs.
- Parameters
file_pathname (Optional[str]) – The pathname of a file from where the settings should be loaded.
**kwargs – Accepts any of the attributes listed for this class. When the value of an attribute isn’t passed as argument, a default value is used. The default values are defined in
GeneticAlgorithmConfig.ATTRIBUTES
.
-
mutation_chance
¶ Chance for a mutation to occur in a new-born genome.
- Type
Tuple[float, float]
-
weight_mutation_chance
¶ Chance of each individual connection weight of a newborn genome being perturbed during mutation.
- Type
Tuple[float, float]
-
weight_perturbation_pc
¶ Maximum absolute percentage of a weight’s value that can be added to it during mutation. When a connection weight is being mutated, it has a chance of being perturbed. This can me summarized as follows (p is the weight perturbation percentage): current weight <- current weight * (1 + random[-p, p]).
- Type
Tuple[float, float]
-
weight_reset_chance
¶ Chance, during mutation, for a weight to have its value reset (in which case a new random value is assigned to it).
- Type
Tuple[float, float]
-
new_weight_interval
¶ When a weight is reset, it will be assigned with a random value in this interval.
- Type
Tuple[float, float]
-
weak_genomes_removal_pc
¶ Percentage of the weakest genomes (those with the lowest fitness) to be removed before reproduction occurs.
- Type
float
-
mating_chance
¶ Chance of a genome reproducing sexually, i.e., by mating / crossing-over with another genome. Decreasing this value increases the chance of a genome reproducing asexually, through binary fission (copy + mutation).
- Type
float
-
mating_mode
¶ How the exchange of genetic material is supposed to happen during a sexual reproduction between two genomes. Options: “weights_mating” and “exchange_layers” (the new genome inherits full layers from its parents).
- Type
str
-
rank_prob_dist_coefficient
¶ Coefficient \(\alpha\) used to calculate the probability distribution used to select genomes for reproduction. Basically, the value of this constant can be interpreted as follows: the genome with the highest fitness has \(\times \alpha\) more chance of being selected for reproduction than the second best genome, which, in turn, has \(\times \alpha\) more chance of being selected than the third best genome, and so forth. This approach to reproduction is called rank-based selection.
- Type
float
-
predatism_chance
¶ Chance of a newborn genome being “predated”, in which case its replaced by a new randomly generated genome. This increases the genetic variability in the population.
- Type
float
-
species_distance_threshold
¶ Minimum distance between two genomes for them to be considered as being of the same species. A lower threshold will make new species easier to appear, increasing the number of species throughout the evolutionary process.
- Type
float
-
species_elitism_threshold
¶ Species with a number of members superior to this threshold will have one or more of their fittest members copied unchanged to the next generation.
- Type
int
-
elitism_pc
¶ Percentage of the genomes of a big enough species to be copied unchanged to the next generation. The number of copied genomes is equal to max(1, ceil(species_size * elitism_pc)).
- Type
float
-
species_no_improvement_limit
¶ If a species doesn’t show improvement in its best fitness for this amount of generations, it will be removed from the species’ list of the population.
- Type
int
-
mass_extinction_threshold
¶ If the population’s fitness doesn’t improve for this amount of generations, the whole population, with the exception of its fittest genome, will be extinct/deleted and replaced by new randomly generated genomes. Here the fitness of a population in a given generation is considered to be equal to the fitness of its fittest genome in that generation. As the number of generations without improvements increases, the mutations chances (as specified in the settings) also increase. This simulates the increase of the evolutionary pressure acting on the population.
- Type
int
-
maex_improvement_threshold_pc
¶ It’s considered that the fitness of a population improved if, and only if, the population’s fitness had an increase equivalent to this percentage. As an example, suppose that the fitness \(f_g\) of a population on generation \(g\) is 100 and that this parameter is set to 0.05 (5%). The fitness \(f_{g+1}\) of the population in the next generation (g + 1) is considered to have improved if, and only if, \(f_{g+1} \geq 1.05 \cdot f_g = 105\).
- Type
float
-
ATTRIBUTES
= {'elitism_pc': 0.03, 'interspecies_mating_chance': 0.05, 'maex_improvement_threshold_pc': 0.03, 'mass_extinction_threshold': 15, 'mating_chance': 0.7, 'mating_mode': 'weights_mating', 'mutation_chance': (0.6, 0.9), 'new_weight_interval': (-2, 2), 'predatism_chance': 0.1, 'rank_prob_dist_coefficient': 1.75, 'species_distance_threshold': 1.75, 'species_elitism_threshold': 5, 'species_no_improvement_limit': 15, 'weak_genomes_removal_pc': 0.5, 'weight_mutation_chance': (0.5, 1), 'weight_perturbation_pc': (0.05, 0.5), 'weight_reset_chance': (0.05, 0.5)}¶ Attributes supported by the class and their default values. Each attribute can passed as a kwarg in the class’ constructor or be specified in a config file. Attributes not specified will be initialized with a default value.
-
MAEX_KEYS
= {'mutation_chance', 'weight_mutation_chance', 'weight_perturbation_pc', 'weight_reset_chance'}¶ Name of the attributes whose values change according to the mass extinction counter (type: Tuple[float, float]).
-
property
maex_counter
¶ Returns the current value stored in the config’s mass extinction counter.
- Return type
int
-
update_mass_extinction
(maex_counter)¶ Updates the mutation chances based on the current value of the mass extinction counter (generations without improvement).
- Parameters
maex_counter (int) – Current value of the mass extinction counter (generations without improvement).
- Return type
None
nevopy.genetic_algorithm.population module¶
Implements a generalizable genetic algorithm that can be used by different neuroevolution algorithms.
-
class
nevopy.genetic_algorithm.population.
DefaultSpecies
(creation_gen, members=None)¶ Bases:
object
Represents a species.
In the context of a genetic algorithm, a species is a set of similar (to some extent) genomes that can mate in order to generate offspring.
- Parameters
creation_gen (int) – Number of the generation in which the species is being created.
members (Optional[List[BaseGenome]]) – Initial members of the species.
-
representative
¶ Genome used to represent the species.
- Type
Optional[BaseGenome]
-
members
¶ List with the genomes that belong to the species.
- Type
List[BaseGenome]
-
last_improvement
¶ Generation in which the species last showed improvement of its fitness. The species fitness in a given generation is equal to the fitness of the species fittest genome on that generation.
- Type
int
-
best_fitness
¶ The last calculated fitness of the species fittest genome.
- Type
Optional[float]
-
avg_fitness
()¶ Returns the average fitness of the species genomes.
- Return type
float
-
compatibility
(genome)¶ Returns a float indicating the compatibility of the given genome with the species.
- Return type
float
-
fittest
()¶ Returns the fittest member of the species.
- Return type
-
update_representative
()¶ Chooses a new representative for the species.
This implementation follows NEAT, so a random member of the species is chosen as its representative.
- Return type
None
-
class
nevopy.genetic_algorithm.population.
GeneticPopulation
(size, base_genome, config=None, processing_scheduler=None, speciation=False)¶ Bases:
nevopy.base_population.BasePopulation
Implementation of a generalizable genetic algorithm.
This class implements a generalizable genetic algorithm that can be used by different neuroevolution algorithms. The algorithm is used to evolve a population of genomes (instances of a subclass of
BaseGenome
).This class does not make strong assumptions about the type of genome it is dealing with, so it does not take into account the type of encoding the genome uses or how it processes input. This allows the implemented algorithm to be used in a wide range of scenarios.
The implemented genetic algorithm uses (optionally) a speciation scheme similar to the one used by the NEAT algorithm [SM02]. The computation of the distance between the genomes, however, is not implemented here, but on the subclass that implements class:.BaseGenome.
When subclassing this class, you probably won’t need to override the
GeneticPopulation.evolve()
method, which contains the main loop of the genetic algorithm.To better understand the default behaviour of the algorithm, it’s recommended to read the docs of the methods
speciate()
andreproduction()
.Example
Example using
FixedTopologyGenome
as the base genome type:def fitness_func(genome): """ Function that takes a genome as input and returns the genome's fitness (a float) as output. """ # ... # Genome that's gonna serve as a model for your population: base_genome = FixedTopologyGenome( layers=[TFDenseLayer(32, activation="relu"), TFDenseLayer(1, activation="sigmoid")], input_shape=my_input_shape, # shape of your input samples ) # Creating and evolving a population: population = GeneticPopulation(size=100, base_genome=base_genome) history = population.evolve(generations=100, fitness_function=fitness_func) # Visualizing the evolution of the population's fitness: history.visualize() # Retrieving and visualizing the fittest genome of the population: best_genome = population.fittest() best_genome.visualize()
- Parameters
size (int) – Number of genomes (constant) in the population.
base_genome (BaseGenome) – Instance of a subclass of
BaseGenome
that will serve as a model/base for all the population’s genomes.config (Optional[GeneticConfig]) – The settings of the evolutionary process. If None, the default settings will be used.
processing_scheduler (Optional[ProcessingScheduler]) – Processing scheduler to be used by the population. If None, a new instance of
RayProcessingScheduler
will be used as scheduler.speciation (bool) – Whether the genetic algorithm used to evolve the genomes should use speciation or not.
-
DEFAULT_SCHEDULER
¶ alias of
nevopy.processing.ray_processing.RayProcessingScheduler
-
property
config
¶ Config object that stores the settings used by the population.
-
evolve
(generations, fitness_function, callbacks=None, verbose=2, **kwargs)¶ Evolves the population using a genetic algorithm.
Main method of this class. It contains the main loop of the genetic algorithm used to evolve the population of genomes.
- Parameters
generations (int) – Number of generations for the algorithm to run. A generation is completed when all the population’s genomes have been processed and reproduction and speciation have occurred.
fitness_function (Callable[[BaseGenome], float]) – Fitness function to be used to evaluate the fitness of individual genomes. It must receive a genome as input and produce a float (the genome’s fitness) as output.
callbacks (Optional[List[Callback]]) – List with instances of
Callback
that will be called during the evolutionary session. By default, aHistory
callback is always included in the list. ACompleteStdOutLogger
or aSimpleStdOutLogger
might also be included, depending on the value passed to theverbose
param.verbose (int) – Verbose level (logging on stdout). Options: 0 (no verbose), 1 (light verbose) and 2 (heavy verbose).
- Return type
- Returns
A
History
object containing useful information recorded during the evolutionary process.
-
static
generate_offspring
(args)¶ Given one or two genomes (parents), generates a new genome.
- Parameters
args (Tuple[BaseGenome, Optional[BaseGenome], bool]) – Tuple containing a genome in its first index, another genome or None in its second index and a bool in its third index. The bool indicates whether predatism will occur or not. If it’s True, then the new genome will be randomly generated. If the second index is another genome, then the new genome will be generated by mating the two given genomes (sexual reproduction). If its None, the new genome will be a mutated copy (asexual reproduction / binary fission) of the genome in the first index.
- Return type
- Returns
A new genome.
-
mass_extinction
(best_genome)¶ All the genomes in the population (except for the best genome) are replaced by new random genomes (random copies of the population’s base genome).
- Return type
None
-
reproduction
()¶ Handles the reproduction of the population’s genomes.
First, the fittest genomes of each species with more than a pre-defined number of individuals are selected to be copied unchanged to the next generation (elitism). Next, the least fit genomes of each species are discarded (reverse elitism). After that, the number of descendants of each species is calculated. The number of offspring assigned to each species is proportional to the average fitness of the species (roulette wheel selection). Finally, the reproduction of individuals of the same species (and, on rare occasions, between genomes of different species as well) occurs.
Genomes with a higher fitness have a higher chance of leaving offspring. Within a species, the chance of a genome reproducing is given by the position it occupies in the species fitness rank (rank-based selection). This means that the reproduction chance of a genome is not directly calculated from the genome’s fitness, but rather from how well positioned is the genome in the fitness rank.
Some of the behaviour described above follows the original description of the NEAT algorithm [SM02].
Newborn genomes have a chance of being “eaten by a predator”, in which case they are replaced by new randomly generated genomes. This technique is called predatism.
- Return type
int
- Returns
Number of preys (individuals replaced by a random genome).
-
speciate
(current_generation)¶ Divides the population’s genomes into species.
The algorithm follows the speciation scheme of the NEAT algorithm [SM02]:
“Each existing species is represented by a random genome inside the species from the previous generation. A given genome g in the current generation is placed in the first species in which g is compatible with the representative genome of that species. This way, species do not overlap. If g is not compatible with any existing species, a new species is created with g as its representative.” - [SM02]
The degree of compatibility between two genomes is given by their distance, calculated by the
BaseGenome.distance()
method. The lower the distance the more compatible two genomes are. Two genomes are considered compatible if their distance is lower than a pre-defined number (GeneticAlgorithmConfig.species_distance_threshold
).Species that haven’t improved their fitness for a pre-defined number of generations are extinct, i.e., they are removed from the population and aren’t considered for the speciation process.
- Return type
None
Module contents¶
Imports core names of nevopy.genetic_algorithms
.