dc.contributor.author: OhEigeartaigh, Sean S
dc.contributor.author: Armisen, David
dc.contributor.author: Byrne, Kevin P
dc.contributor.author: Wolfe, Kenneth H
dc.date.accessioned: 2011-08-29T14:52:43Z
dc.date.available: 2011-08-29T14:52:43Z
dc.date.issued: 2011-07-26
dc.identifier: http://dx.doi.org/10.1186/1471-2164-12-377
dc.identifier.citation: BMC Genomics. 2011 Jul 26;12(1):377
dc.identifier.uri: http://hdl.handle.net/10147/141157
dc.description.abstract: Background: In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results: We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions: SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external information has been added may prove useful in other settings.
dc.title: Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments
dc.type: Journal Article
dc.language.rfc3066: en
dc.rights.holder: OhEigeartaigh et al.; licensee BioMed Central Ltd.
dc.description.status: Peer Reviewed
dc.date.updated: 2011-08-26T13:23:48Z
refterms.dateFOA: 2018-08-22T13:39:06Z


Files in this item

Name: 1471-2164-12-377.xml
Size: 96.34Kb
Format: XML

Name: 1471-2164-12-377.pdf
Size: 1.096Mb
Format: PDF

Name: 1471-2164-12-377-S1.DOC
Size: 597Kb
Format: Microsoft Word
