Ensemble approach combining multiple methods improves human transcription start site prediction.
Affiliation
Complex and Adaptive Systems Laboratory (CASL), University College Dublin, Belfield, Dublin 4, Ireland. david.dineen@ucd.ie
Issue Date
2010
MeSH
Base Pairing
Computational Biology
Genome, Human
Humans
Principal Component Analysis
Promoter Regions, Genetic
Software
Transcription Initiation Site
Citation
Ensemble approach combining multiple methods improves human transcription start site prediction. 2010, 11:677 BMC Genomics
Journal
BMC Genomics
DOI
10.1186/1471-2164-11-677
PubMed ID
21118509
Abstract
The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques, and result in different prediction sets.
We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state of the art, as benchmarked by a third-party tool.
Supervised learning methods are a useful way to combine predictions from diverse sources.
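The either/or combination described in the abstract can be illustrated with a minimal sketch. This is not the Profisi Ensemble implementation: the two trained support vector machines are replaced here by fixed linear decision functions with hypothetical weights, the four input scores stand in for the first-level predictions (the abstract mentions 7 programs and 2 other data sources), and treating the 'reduced' data set as a subset of the score columns is an assumption, since the abstract does not define it.

```python
from typing import Callable, Sequence

# One candidate position is described by the scores that the first-level
# predictors assigned to it (4 made-up scores here, for brevity).
Features = Sequence[float]
Classifier = Callable[[Features], bool]


def linear_classifier(weights: Sequence[float], bias: float,
                      columns: Sequence[int]) -> Classifier:
    """Stand-in for a trained SVM: the sign of a linear decision function
    evaluated on the selected feature columns (weights are hypothetical)."""
    def predict(x: Features) -> bool:
        score = sum(w * x[c] for w, c in zip(weights, columns)) + bias
        return score > 0.0
    return predict


def either_or(first: Classifier, second: Classifier) -> Classifier:
    """Call a position a TSS if either second-level classifier does."""
    def predict(x: Features) -> bool:
        return first(x) or second(x)
    return predict


# Assumption: the 'full' model sees every first-level score and the
# 'reduced' model sees only a subset of them.
full_model = linear_classifier([0.5, 0.5, 0.5, 0.5], -1.2, [0, 1, 2, 3])
reduced_model = linear_classifier([1.0, 1.0], -1.5, [0, 1])
ensemble = either_or(full_model, reduced_model)

candidates = {
    "strong_everywhere": [0.9, 0.8, 0.7, 0.9],
    "only_first_two": [0.9, 0.9, 0.0, 0.0],
    "weak": [0.1, 0.2, 0.1, 0.0],
}
calls = {name: ensemble(x) for name, x in candidates.items()}
```

In this toy data the candidate `only_first_two` is rejected by the full model but accepted by the reduced one, so the ensemble still calls it. That is the point of an either/or rule: it trades some specificity for sensitivity on positions that only one of the two models recognises.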
Item Type
Article
Language
en
ISSN
1471-2164
Related articles
- EnsemPro: an ensemble approach to predicting transcription start sites in human genomic DNA sequences.
- Authors: Won HH, Kim MJ, Kim S, Kim JW
- Issue date: 2008 Mar
- Boosting with stumps for predicting transcription start sites.
- Authors: Zhao X, Xuan Z, Zhang MQ
- Issue date: 2007
- Computational detection and location of transcription start sites in mammalian genomic DNA.
- Authors: Down TA, Hubbard TJ
- Issue date: 2002 Mar
- Stepwise approach for combining many sources of evidence for site-recognition in genomic sequences.
- Authors: Pérez-Rodríguez J, García-Pedrajas N
- Issue date: 2016 Mar 5
- iPro-WAEL: a comprehensive and robust framework for identifying promoters in multiple species.
- Authors: Zhang P, Zhang H, Wu H
- Issue date: 2022 Oct 14