An Exploration of Generalization and Overfitting in Genetic Programming: Standard and Geometric Semantic Approaches

Gonçalves, Ivo Carlos Pereira

Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/32725

Title:	An Exploration of Generalization and Overfitting in Genetic Programming: Standard and Geometric Semantic Approaches
Authors:	Gonçalves, Ivo Carlos Pereira
Orientador:	Fonseca, Carlos Silva, Sara
Keywords:	Evolutionary Computation; Genetic Programming; Geometric Semantic Genetic Programming; Supervised Learning; Generalization; Overfitting; Neural Networks
Issue Date:	22-Feb-2017
Citation:	GONÇALVES, Ivo Carlos Pereira - An exploration of generalization and overfitting in genetic programming : standard and geometric semantic approaches. Coimbra : [s.n.], 2017. Tese de doutoramento. Disponível na WWW: http://hdl.handle.net/10316/32725
Project:	info:eu-repo/grantAgreement/FCT/SFRH/SFRH/BD/79964/2011/PT
Abstract:	Computational learning refers to the task of inducing a general pattern from a provided set of examples. A learning method is expected to generalize to unseen examples of the same pattern. A common issue in computational learning is the possibility that the resulting models could be simply learning the provided set of examples, instead of learning the underlying pattern. A model that is incurring in such a behavior is commonly said to be overfitting. This dissertation explores the task of computational learning and the related concepts of generalization and overfitting, in the context of Genetic Programming (GP). GP is a computational method inspired by natural evolution that considers a set of primitive functions and terminals that can be combined without any considerable constraints on the structure of the models being evolved. This flexibility can help in learning complex patterns but it also increases the risk of overfitting. The contributions of this dissertation cover the most common form of GP (Standard GP), as well as the recently proposed Geometric Semantic GP (GSGP). The initial set of approaches relies on dynamically selecting different training data subsets during the evolutionary process. These approaches can avoid overfitting and improve the resulting generalization without restricting the flexibility of GP. Besides improving the generalization, these approaches also produce considerably smaller individuals. An analysis of the generalization ability of GSGP is performed, which shows that the generalization outcome is greatly dependent on particular characteristics of the mutation operator. It is shown that, as Standard GP, the original formulation of GSGP is prone to overfitting. The necessary conditions to avoid overfitting are presented. When such conditions are in place, GSGP can achieve a particularly competitive generalization. A novel geometric semantic mutation that substantially improves the effectiveness and efficiency of GSGP is proposed. Besides considerably improving the training data learning rate, it also achieves a competitive generalization with only a few applications of the mutation operator. The final set of contributions covers the domain of Neural Networks (NNs). These contributions originated as an extension of the research conducted within GSGP. This set of contributions includes the definition of a NN construction algorithm based on an extension of the mutation operator defined in GSGP. Similarly to GSGP, the proposed algorithm searches over a space without local optima. This allows for an effective and efficient stochastic search in the space of NNs, without the need to use backpropagation to adjust the weights of the network. Finally, two search stopping criteria are proposed, which can be directly used in the proposed NN construction algorithm and in GSGP. These stopping criteria are able to detect when the risk of overfitting increases significantly. It is shown that the stopping points detected result in a competitive generalization. Aprendizagem computacional refere-se à tarefa de induzir um padrão general através de um conjunto de exemplos. Espera-se que um método de aprendizagem consiga generalizar para exemplos não vistos do mesmo padrão. Um problema comum em aprendizagem computacional é a possibilidade de os modelos resultantes aprenderem simplesmente o conjunto de exemplos dado, em vez de aprender o padrão subjacente. Quando um modelo tem este comportamento, diz-se que ele está em sobreajustamento. Esta dissertação explora a tarefa de aprendizagem computacional e os conceitos relacionados de generalização e sobreajustamento, no contexto da Programação Genética (PG). PG é um método computacional inspirado pela evolução natural que considera um conjunto de funções primitivas e de terminais, que podem ser combinados sem nenhuma restrição considerável em relação à estrutura dos modelos que estão a ser evoluídos. Esta flexibilidade pode ajudar a aprendizagem de padrões complexos mas também aumenta o risco do sobreajustamento. As contribuições desta dissertação cobrem a forma mais comum de PG (Standard PG), assim como a recentemente proposta Programação Genética Geométrica Semântica (PGGS). O conjunto de abordagens iniciais é baseado numa seleção dinâmica de diferentes subconjuntos dos dados de treino durante o processo evolucionário. Estas abordagens conseguem evitar o sobreajustamento e melhorar a generalização resultante sem restringir a flexibilidade da PG. É realizada uma análise da capacidade de generalização da PGGS, que demonstra que a generalização resultante é consideravelmente dependente das características do operador de mutação. É demonstrado que, tal como Standard PG, a formulação original de PGGS tem tendência a sobreajustar. São apresentadas as condições necessárias para evitar o sobreajustamento. Quando essas condições se verificam, PGGS consegue alcançar uma generalização particularmente competitiva. É proposta uma nova mutação geométrica semântica que melhora de forma substancial a eficácia e a eficiência da PGGS. Além de melhorar consideravelmente o ritmo de aprendizagem dos dados de treino, também consegue alcançar uma generalização competitiva com apenas algumas aplicações do operador de mutação. O conjunto final de contribuições cobre o domínio das Redes Neuronais (RNs). Estas contribuições resultaram de uma extensão da investigação realizada em PGGS. Este conjunto de contribuições inclui a definição de um algoritmo de construção de RNs baseado numa extensão do operador de mutação da PGGS. De forma semelhante à PGGS, o algoritmo proposto pesquisa sobre um espaço sem óptimos locais. Isto permite realizar uma pesquisa estocástica eficaz e eficiente no espaço das RNs, sem existir a necessidade de usar retropropagação para ajustar os pesos da rede. Finalmente, são propostos dois critérios de paragem de pesquisa que conseguem detectar quando o risco de sobreajustamento sobe consideravelmente. É mostrado que os pontos de paragem detectados resultam numa generalização competitiva.
Description:	Tese de doutoramento em Ciências e Tecnologias da Informação, apresentada ao Departamento de Engenharia Informática da Faculdade de Ciências e Tecnologia da Universidade de Coimbra
URI:	https://hdl.handle.net/10316/32725
Rights:	openAccess
Appears in Collections:	FCTUC Eng.Informática - Teses de Doutoramento

Files in This Item:

File	Description	Size	Format
An Exploration of Generalization and Overfitting in Genetic Programming.pdf		10.77 MB	Adobe PDF	View/Open

Show full item record

Google Scholar^TM

Check

Files in This Item:

Google ScholarTM

Google Scholar^TM