International Journal of Applied Mathematics and Computer Science

Paper details

Number 3 - September 2012
Volume 22 - 2012

KIS: An automated attribute induction method for classification of DNA sequences

Rafał Biedrzycki, Jarosław Arabas

Abstract
This paper applies machine learning methods to the task of DNA sequence recognition. We present an algorithm that learns to recognize groups of DNA sequences sharing common features, such as sequence functionality. We demonstrate the algorithm on splice site detection, i.e., the proper detection of donor and acceptor sequences, and compare the results with those of reference methods that have been designed and tuned to detect splice sites. We also show how to use the algorithm to find a human-readable model of the IRE (Iron-Responsive Element) and to find IRE sequences. The method, although universal, yields results of a quality comparable to those obtained by the reference methods. In contrast to the reference methods, this approach uses models that operate on sequence patterns, which facilitates interpretation of the results by humans.
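
To illustrate what a model operating on sequence patterns might look like, the following minimal Python sketch turns a DNA sequence into binary attributes, one per pattern. It is not the KIS algorithm: the function names, the use of IUPAC nucleotide codes, and the two example patterns are assumptions chosen only for illustration, and the abstract does not specify how KIS represents or induces its patterns.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def compile_pattern(pattern):
    """Translate an IUPAC pattern string into a compiled regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in pattern))


def induce_attributes(sequence, patterns):
    """Return one binary attribute per pattern: 1 if it occurs in the sequence."""
    return [1 if compile_pattern(p).search(sequence) else 0 for p in patterns]


# Hypothetical patterns, loosely resembling donor-like and acceptor-like motifs.
# They are illustrative assumptions, not patterns reported in the paper.
PATTERNS = ["GTRAGT", "YYYYNCAG"]

donor_like = induce_attributes("CAGGTAAGTAT", PATTERNS)
acceptor_like = induce_attributes("TTTTCCAGG", PATTERNS)
```

Attribute vectors of this kind could be passed to any standard classifier, and because each attribute corresponds to a readable pattern, the resulting model stays open to inspection by a human.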

Keywords
classification, optimization, annotation, patterns

DOI
10.2478/v10006-012-0053-2