GiSA: A Grid System for Genome Sequences Assembly


Authors: Baile Shi, Chen Wang, Dong Huang, Jun Tang, Wei Wang

Tags: 2004, conceptual modeling

Sequencing genomes is a fundamental aspect of biological research. Shotgun sequencing, since its introduction by Sanger et al. [2], has remained the mainstay of genome sequence assembly. The method randomly obtains sequence reads (e.g. subsequences of about 500 characters) from a genome and then assembles them into contigs based on significant overlaps among them. The whole-genome shotgun (WGS) approach generates sequence reads directly from a whole-genome library and uses computational techniques to reassemble them. A variety of assembly programs have been proposed and implemented, including PHRAP [3] (Green 1994), CAP3 [4] (1999), and Celera [5] (2000). Because of the high computational complexity of assembly and the increasingly large size of the input data, these programs incur great time and space overhead. PHRAP [3], for instance, can only run in a stand-alone way and requires memory many times (usually more than 10 times) the size of the original sequence data. In realistic applications, the sequencing process can become unacceptably slow because of insufficient memory, even on a mainframe with a huge amount of RAM.
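To make the idea of assembling reads into contigs by overlap concrete, the following is a minimal toy sketch in Python. It is not the algorithm used by PHRAP, CAP3, Celera, or GiSA. It assumes error-free reads and exact suffix-prefix matching, and the function names `overlap` and `greedy_assemble` as well as the `min_len` threshold are hypothetical, introduced only for illustration.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy greedy assembly: repeatedly merge the pair with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        # Compare every ordered pair of current contigs.
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining pair overlaps enough
        # Join the two sequences, keeping the shared region only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three short, error-free reads drawn from the toy genome "ATGGCGTGCAATCC".
reads = ["TGCAATCC", "ATGGCGTG", "GCGTGCAA"]
print(greedy_assemble(reads))
```

Even in this toy version, every merge step compares all ordered pairs of sequences, which gives a sense of why time and memory costs grow quickly with the number of reads.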

Read the full paper here: https://link.springer.com/chapter/10.1007/978-3-540-30464-7_63