2020-10-28
2020-10-28
2019-12-02
GAIA, Antonio Sérgio Cruz. Ferramenta baseada em cuckoo filter para remoção de redundância em dados de sequenciadores de segunda geração (NGS - next generation sequencing) [Cuckoo-filter-based tool for removing redundancy in data from second-generation sequencers (NGS - next generation sequencing)]. Advisor: Adonney Allan de Oliveira Veras. 2019. 108 pp. Dissertation (Master's in Applied Computing) - Núcleo de Desenvolvimento Amazônico em Engenharia, Universidade Federal do Pará, Tucuruí, 2019. Available at: https://repositorio.ufpa.br/jspui/handle/2011/12798. Accessed on:.
https://repositorio.ufpa.br/handle/2011/12798
Second-generation sequencing platforms, also known as NGS (Next Generation Sequencing), produce large volumes of data whose processing is complex and computationally expensive. These platforms generate duplicated reads, which originate in the preparation of the genomic library and are introduced during the PCR (Polymerase Chain Reaction) amplification stage. This redundancy can increase the computational requirements and processing time of subsequent analyses (for instance, de novo assembly). To reduce the computational cost of these analyses, the duplicated reads must be removed from the data set of the sequenced organism. In this work, we present NGSReadsTreatment, a computational tool that removes duplicated reads from paired-end or single-end data sets. The input to NGSReadsTreatment consists of reads from any sequencing platform, with the same or different read lengths. Its engine uses a Cuckoo Filter, a probabilistic data structure, to identify and remove redundant reads. Identification is done by comparing the reads with one another, so nothing is required beyond the read set itself. The tool was validated on real and simulated data sets. To assess its efficiency, it was compared with other redundancy-removal tools.
The results indicate that NGSReadsTreatment is efficient: in all tests it produced the best outcome, both in the number of redundancies removed and in memory use. Developed in Java, NGSReadsTreatment is compatible with UNIX/Linux and Windows operating systems and has a version with a graphical interface to facilitate its use.
Open Access
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
Software - Development
Data structures (Computing)
Cuckoo Filter
String redundancy
DNA sequencing
Ferramenta baseada em cuckoo filter para remoção de redundância em dados de sequenciadores de segunda geração (NGS - next generation sequencing)
Dissertation
CNPQ::ENGENHARIAS
SYSTEMS DEVELOPMENT
APPLIED COMPUTING
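The abstract describes removing duplicated reads by testing each read against a Cuckoo Filter. The following is a minimal Java sketch of that general idea for single-end reads held in memory as strings; it is not the NGSReadsTreatment source code. The class name `CuckooReadFilter`, the method `removeDuplicates`, the 16-bit fingerprints, the four slots per bucket, the FNV-1a hash, and the kick limit are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative cuckoo filter over read sequences (hypothetical names and parameters). */
final class CuckooReadFilter {
    private static final int SLOTS_PER_BUCKET = 4;  // assumed bucket size
    private static final int MAX_KICKS = 500;       // assumed relocation limit

    private final short[][] buckets;  // 16-bit fingerprints; 0 marks an empty slot
    private final int mask;           // bucket count is a power of two, mask = count - 1
    private final Random random = new Random(42);

    CuckooReadFilter(int expectedItems) {
        int needed = (int) Math.ceil(expectedItems / (SLOTS_PER_BUCKET * 0.9));
        int numBuckets = 1;
        while (numBuckets < needed) numBuckets <<= 1;
        buckets = new short[numBuckets][SLOTS_PER_BUCKET];
        mask = numBuckets - 1;
    }

    /** 64-bit FNV-1a hash of the read's bytes. */
    private static long hash64(String read) {
        long h = 0xcbf29ce484222325L;
        for (byte b : read.getBytes(StandardCharsets.US_ASCII)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    /** Top 16 bits of the hash, remapped so that 0 stays reserved for "empty". */
    private static short fingerprint(long h) {
        short fp = (short) (h >>> 48);
        return fp == 0 ? (short) 1 : fp;
    }

    /** Partial-key cuckoo hashing: the alternate bucket depends only on index and fingerprint. */
    private int altIndex(int index, short fp) {
        return (index ^ ((fp & 0xFFFF) * 0x5bd1e995)) & mask;
    }

    private boolean holds(int bucket, short fp) {
        for (short stored : buckets[bucket]) if (stored == fp) return true;
        return false;
    }

    private boolean place(int bucket, short fp) {
        for (int s = 0; s < SLOTS_PER_BUCKET; s++) {
            if (buckets[bucket][s] == 0) { buckets[bucket][s] = fp; return true; }
        }
        return false;
    }

    /** True if the read was probably inserted before (false positives are possible). */
    boolean contains(String read) {
        long h = hash64(read);
        short fp = fingerprint(h);
        int i1 = (int) h & mask;
        return holds(i1, fp) || holds(altIndex(i1, fp), fp);
    }

    /** Inserts the read's fingerprint; returns false if the table is considered full. */
    boolean insert(String read) {
        long h = hash64(read);
        short fp = fingerprint(h);
        int i1 = (int) h & mask;
        int i2 = altIndex(i1, fp);
        if (place(i1, fp) || place(i2, fp)) return true;
        int i = random.nextBoolean() ? i1 : i2;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            // Evict a random resident fingerprint and move it to its alternate bucket.
            int slot = random.nextInt(SLOTS_PER_BUCKET);
            short evicted = buckets[i][slot];
            buckets[i][slot] = fp;
            fp = evicted;
            i = altIndex(i, fp);
            if (place(i, fp)) return true;
        }
        return false;
    }

    /** Keeps a read only if the filter has not (probably) seen its sequence yet. */
    static List<String> removeDuplicates(List<String> reads) {
        CuckooReadFilter filter = new CuckooReadFilter(reads.size());
        List<String> unique = new ArrayList<>();
        for (String read : reads) {
            if (filter.contains(read)) continue;
            if (!filter.insert(read)) throw new IllegalStateException("filter is full");
            unique.add(read);
        }
        return unique;
    }
}
```

Calling `CuckooReadFilter.removeDuplicates(...)` on a list of sequences returns a list with no repeated sequence, in input order. Because a Cuckoo Filter can report false positives, a small fraction of genuinely unique reads could also be dropped by a scheme like this one; the record does not say how NGSReadsTreatment handles that case, nor how it pairs reads in paired-end mode.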