Recurrent Models for Drug Generation

Carvalho, Angélica Santos

Utilize este identificador para referenciar este registo: https://hdl.handle.net/10316/87303

Título:	Recurrent Models for Drug Generation
Outros títulos:	Modelos Recorrentes para Geração de Fármacos
Autor:	Carvalho, Angélica Santos
Orientador:	Arrais, Joel Perdiz
Palavras-chave:	Drug Discovery; Deep Learning; Modelos Recurrentes; Validação; Fragmentation Growing Procedure; Drug Discovery; Deep Learning; Recurrent Models; Validation; Fragmentation Growing Procedure
Data:	15-Jul-2019
Título da revista, periódico, livro ou evento:	Recurrent Models for Drug Generation
Local de edição ou do evento:	DEI-FCTUC
Resumo:	A descoberta de medicamentos visa identificar potenciais novos medicamentos através de um processo multidisciplinar, incluindo várias áreas científicas, como a biologia, a química e a farmacologia. Atualmente, múltiplas estratégias e metodologias têm sido desenvolvidas para descobrir, testar e otimizar novos medicamentos. No entanto, há um longo processo que vai desde a identificação de alvos até uma molécula comercializável. O objetivo principal desta dissertação é desenvolver um modelo computacional capaz de propor novos compostos. Para atingir este objetivo, foi explorado e treinado um modelo recorrente para gerar um novo Simplified molecular-input line-entry system (SMILES). As Artificial Neural Network (ANN) estudadas nesta dissertação foram Recurrent Neural Network (RNN), Long-Short Term Memory (LSTM), Gated Recurrent Unit (GRU) e Bidirectional Long-Short Term Memory (BLSTM). Um conjunto de dados consistente foi escolhido e os SMILES gerados pelo modelo foram sintática e bioquimicamente validados. Para restringir a geração de SMILES, foi utilizada uma técnica denominada Fragmentation Growing Procedure, onde é possível escolher um fragmento e gerar SMILES a partir dele. Para analisar a rede recorrente que melhor se ajusta e os respetivos parâmetros, foram realizados alguns testes e a rede contida no modelo que atingiu o melhor resultado, 98% SMILES válidos e 93% SMILES únicos, foi uma LSTM com 2 camadas. A técnica de restrição de geração foi utilizada no melhor modelo e atingiu 99% dos SMILES válidos e 79% dos SMILES únicos. Drug discovery aims to identify potential new medicines through a multidisciplinary process, including several scientific areas, such as biology, chemistry and pharmacology. Nowadays, multiple strategies and methodologies have been developed to discover, test and optimise new drugs. However, there is a long process from target identification to an optimal marketable molecule. The main purpose of this dissertation is to develop computational models able to propose new drug compounds. In order to achieve this goal, the artificial neural networks explored and trained to generate new drugs in the form of Simplified Molecular-Input Line-Entry System (SMILES). The explored neural networks model were Recurrent Neural Network (RNN), Long-Short Term Memory (LSTM), Gated Recurrent Unit (GRU) and Bidirectional Long-Short Term Memory (BLSTM). A consistent dataset was chosen, and the generated SMILES by the model were syntactically and biochemically validated. In order to restrict the generation of SMILES, a technique denominated Fragmentation Growing Procedure was used, where made it possible to choose a fragment and generate SMILES from that. To analyse the recurrent network that fits the best and the respective parameters, some tests were performed, and the network contained in the model that reached the best result, 98% of valid SMILES and 93% of unique SMILES, was an LSTM with two layers. The technique to restrict the generation was used in the best model and reached 99% of valid SMILES and 79% of unique SMILES.
Descrição:	Dissertação de Mestrado em Engenharia Informática apresentada à Faculdade de Ciências e Tecnologia
URI:	https://hdl.handle.net/10316/87303
Direitos:	openAccess
Aparece nas coleções:	UC - Dissertações de Mestrado

Ficheiros deste registo:

Ficheiro	Descrição	Tamanho	Formato
tese_v2.0.pdf		3.11 MB	Adobe PDF	Ver/Abrir

Mostrar registo em formato completo

Visualizações de página 50

432

Visto em 23/jul/2024

Downloads 50

648

Visto em 23/jul/2024

Google Scholar^TM

Verificar

Este registo está protegido por Licença Creative Commons

Ficheiros deste registo:

Visualizações de página 50

Downloads 50

Google ScholarTM

Google Scholar^TM