Software

SENSE: Siamese neural network for amplicon sequence embedding

The rapid development of sequencing technology has led to an explosive accumulation of genomic data. Sequence comparison is a fundamental component of sequence analysis. However, alignment-based methods have various limitations when analyzing large-scale datasets, and existing alignment-free methods mainly rely on handcrafted features. In this paper, we introduce a new approach named SiamEse Neural Sequence Embedding (SENSE) for efficient and accurate alignment-free sequence comparison. To evaluate its performance, we applied it to two different real-world large-scale sequence datasets. By comparing directly against alignment-based results, we demonstrated that the proposed method outperformed state-of-the-art alignment-free methods in terms of both accuracy and efficiency.

Source code and documentation

Manuscript