Title: An improved nearest neighbour algorithm for DNA sequence clustering
Authors: Chen, Guizhen
Keywords: Science::Mathematics
Issue Date: 2020
Publisher: Nanyang Technological University
Abstract: This paper explores clustering algorithms for constructing phylogenetic trees from distance measures such as the 18-dimensional vector distance. The main objective is to investigate distance-based methods of tree construction and to design an efficient, accurate algorithm that reduces computational time. We analyse the time complexity of the UPGMA method and introduce our approach, which builds on the idea of the ball tree. We evaluate it on real datasets including filoviruses, influenza viruses and bacterial genomes. In general, the new approach separates the species well and produces better results than the original ball tree method. The modified ball tree structure also greatly reduces the computational time.
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SPMS Student Reports (FYP/IA/PA/PI)

Files in This Item:
Description: Restricted Access
Size: 944.07 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.