• Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences.

      Ivanov, Ivaylo P; Firth, Andrew E; Michel, Audrey M; Atkins, John F; Baranov, Pavel V; BioSciences Institute, University College Cork, Cork, Ireland. iivanov@genetics.utah.edu (2011-05)
      In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5' cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized--both for increased coding capacity and potentially also for novel regulatory mechanisms--remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5' untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.
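      The set of codons "differing from AUG by a single nucleotide" can be enumerated directly. The following is a minimal illustrative sketch, not part of the authors' methodology; the function name `near_cognate_codons` is hypothetical.

```python
from itertools import product


def near_cognate_codons(start="AUG", alphabet="ACGU"):
    """Return every codon that differs from `start` at exactly one position.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    variants = []
    for pos, base in product(range(len(start)), alphabet):
        if base != start[pos]:
            # Substitute a single base at position `pos`.
            variants.append(start[:pos] + base + start[pos + 1:])
    return variants


codons = near_cognate_codons()
# The six codons the abstract singles out as especially common initiators.
highlighted = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}
```

      Three positions with three alternative bases each give nine single-nucleotide variants of AUG; the six codons named in the abstract are a subset of them.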
    • Infrequent detection of germline allele-specific expression of TGFBR1 in lymphoblasts and tissues of colon cancer patients.

      Guda, Kishore; Natale, Leanna; Lutterbaugh, James; Wiesner, Georgia L; Lewis, Susan; Tanner, Stephan M; Tomsic, Jerneja; Valle, Laura; de la Chapelle, Albert; Elston, Robert C; et al. (2009-06-15)
      Recently, germline allele-specific expression (ASE) of the gene encoding transforming growth factor-beta type I receptor (TGFBR1) has been proposed to be a major risk factor for cancer predisposition in the colon. Germline ASE results in a lowered expression of one of the TGFBR1 alleles (>1.5-fold), and was shown to occur in approximately 20% of informative familial and sporadic colorectal cancer (CRC) cases. In the present study, using the highly quantitative pyrosequencing technique, we estimated the frequency of ASE in TGFBR1 in a cohort of affected individuals from familial clusters of advanced colon neoplasias (cancers and adenomas with high-grade dysplasia), and also in a cohort of individuals with sporadic CRCs. Cases were considered positive for ASE if they demonstrated an allelic expression ratio <0.67 or >1.5. Using RNA derived from lymphoblastoid cell lines, we find that of 46 informative Caucasian advanced colon neoplasia cases with a family history, only 2 individuals display a modest ASE, with allelic ratios of 1.65 and 1.73, respectively. Given that ASE of TGFBR1, if present, would likely be more pronounced in the colon than in other tissues, we additionally determined the allele ratios of TGFBR1 in RNA derived from normal-appearing colonic mucosa of sporadic CRC cases. However, we found no evidence of ASE in any of the 44 informative sporadic cases analyzed. Taken together, we find that germline ASE of TGFBR1, as assayed in lymphoblastoid and colon epithelial cells of colon cancer patients, is a relatively rare event.
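      The ASE criterion stated in the abstract (allelic expression ratio <0.67 or >1.5) can be written as a simple threshold test. This is a sketch under that stated criterion only; the function name `shows_ase` and its parameter names are hypothetical, and it says nothing about how the ratios themselves were measured.

```python
def shows_ase(allelic_ratio, low=0.67, high=1.5):
    """Return True if the allelic expression ratio falls outside [low, high].

    Thresholds are the ones quoted in the abstract; the function itself
    is an illustrative assumption, not the authors' code.
    """
    return allelic_ratio < low or allelic_ratio > high


# The two familial cases reported with modest ASE, plus a balanced ratio.
reported = [shows_ase(r) for r in (1.65, 1.73)]
balanced = shows_ase(1.0)
```

      Both reported ratios (1.65 and 1.73) exceed the upper threshold, whereas a balanced ratio of 1.0 does not qualify.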
    • Recode-2: new design, new search tools, and many more genes.

      Bekaert, Michaël; Firth, Andrew E; Zhang, Yan; Gladyshev, Vadim N; Atkins, John F; Baranov, Pavel V; School of Biology and Environmental Science, University College Dublin, BioSciences Institute, University College Cork, Ireland. (2010-01)
      'Recoding' is a term used to describe non-standard read-out of the genetic code, and encompasses such phenomena as programmed ribosomal frameshifting, stop codon readthrough, selenocysteine insertion and translational bypassing. Although only a small proportion of genes utilize recoding in protein synthesis, accurate annotation of 'recoded' genes lags far behind annotation of 'standard' genes. In order to address this issue, provide a service to researchers in the field, and offer training data for developers of gene-annotation software, we have gathered together known cases of recoding within the Recode database. Recode-2 is an improved and updated version of the database. It provides access to detailed information on genes known to utilize translational recoding and allows complex search queries, browsing of recoding data and enhanced visualization of annotated sequence elements. At present, the Recode-2 database stores information on approximately 1500 genes that are known to utilize recoding in their expression, approximately a threefold increase over the previous version of the database. Recode-2 is available at http://recode.ucc.ie.
    • Sequencing and analysis of an Irish human genome.

      Tong, Pin; Prendergast, James G D; Lohan, Amanda J; Farrington, Susan M; Cronin, Simon; Friel, Nial; Bradley, Dan G; Hardiman, Orla; Evans, Alex; Wilson, James F; et al. (2010)
      Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.
    • Sequencing illustrates the transcriptional response of Legionella pneumophila during infection and identifies seventy novel small non-coding RNAs.

      Weissenmayer, Barbara A; Prendergast, James G D; Lohan, Amanda J; Loftus, Brendan J; UCD Conway Institute for Biomolecular and Biomedical Research, Dublin, Ireland. (2011)
      Second generation sequencing has prompted a number of groups to re-interrogate the transcriptomes of several bacterial and archaeal species. One of the central findings has been the identification of complex networks of small non-coding RNAs that play central roles in transcriptional regulation in all growth conditions and for the pathogen's interaction with and survival within host cells. Legionella pneumophila is a Gram-negative facultative intracellular human pathogen with a distinct biphasic lifestyle. One of its primary environmental hosts is the free-living amoeba Acanthamoeba castellanii, and its infection by L. pneumophila mimics that seen in human macrophages. Here we present an analysis of strand-specific sequencing of the transcriptional response of L. pneumophila during exponential and post-exponential broth growth and during the replicative and transmissive phases of infection inside A. castellanii. We extend previous microarray-based studies and uncover evidence of a complex regulatory architecture underpinned by numerous non-coding RNAs. Over seventy new non-coding RNAs could be identified; many of them appear to be strain specific and in configurations not previously reported. We discover a family of non-coding RNAs preferentially expressed during infection conditions and identify a second copy of 6S RNA in L. pneumophila. We show that the newly discovered putative 6S RNA, as well as a number of other non-coding RNAs, shows evidence of antisense transcription. The nature and extent of the non-coding RNAs and their expression patterns suggest that these may well play central roles in the regulation of Legionella spp. specific traits and offer clues as to how L. pneumophila adapts to its intracellular niche. The expression profiles outlined in the study have been deposited into Genbank's Gene Expression Omnibus (GEO) database under the series accession GSE27232.