Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU
Chacón, Alejandro (Universitat Autònoma de Barcelona)
Marco-Sola, Santiago (Centre Nacional d'Anàlisi Genòmica)
Date: 2014
Abstract: Approximate string matching is a central problem in computational biology; it requires the fast computation of string distance as one of its essential components. Myers' bit-parallel algorithm improves on the classical dynamic programming approach to Levenshtein distance computation and offers competitive performance on CPUs. The main challenge when designing an efficient GPU implementation is to expose enough SIMD parallelism while keeping a relatively small working set for each thread. In this work we implement and optimise a CUDA version of Myers' algorithm suitable for use as a building block for DNA sequence alignment. We achieve high efficiency by means of a cooperative parallelisation strategy for (1) very long integer addition and shift operations, and (2) several simultaneous pattern matching tasks. In addition, we explore the performance impact of using features specific to the Kepler architecture. Our results show an overall performance of the order of tera cell updates per second using a single high-end NVIDIA GPU, and speedups in excess of 20x with respect to a sixteen-core, non-vectorised CPU implementation.
Grants: Ministerio de Ciencia e Innovación TIN2011-28689-C02-01
Rights: All rights reserved.
Language: English
Document: Conference paper
Subject: SIMD; GPU; CUDA; Myers' algorithm
Published in: ICS: International Conference on Supercomputing. Munich, Germany, 2014
DOI: 10.1145/2597652.2597677
Post-print
10 p, 646.3 KB
Record created 2015-04-28, last modified 2022-06-27