Molecular Biology and Biochemistry, Department of

Ebbie: Automated Analysis and Storage of Small RNA Cloning Data Using a Dynamic Web Server

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2006
Abstract: 

BACKGROUND: DNA sequencing is used ubiquitously: from deciphering genomes[1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences[6]. Finding no easily accessible research tool to enter and analyze smRNA sequences, we developed Ebbie to assist us with our study.
RESULTS: Ebbie is a semi-automated smRNA cloning data processing algorithm that first searches a DNA sequencing text file for any substring flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL database and a BlastN database. These inserts are then compared using BlastN to locally installed databases, allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project[6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on http://bioinformatics.org/ebbie/.
CONCLUSION: Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest. The reliable storage of inserts and their annotation in a MySQL database, together with BlastN[9] comparison of new inserts to dynamic and static databases, make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11].
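
The excision step described in the abstract (finding the substring flanked by two constant strings) can be illustrated with a minimal Python sketch. This is not Ebbie's actual code: the function name `extract_insert` and the two flanking sequences are hypothetical placeholders, since the abstract does not give the real primer sequences, and the MySQL and BlastN steps are not reproduced.

```python
from typing import Optional

# Hypothetical flanking sequences, chosen only for illustration;
# the abstract does not state the primers used in the tobacco project.
LEFT_FLANK = "ACGTACGTAC"
RIGHT_FLANK = "TTGGCCAAGG"


def extract_insert(read: str, left: str = LEFT_FLANK,
                   right: str = RIGHT_FLANK) -> Optional[str]:
    """Return the substring of `read` between `left` and `right`.

    Whitespace is removed and the read is upper-cased first, because
    sequencing text files often wrap lines and mix letter case.
    Returns None when either flank is missing.
    """
    seq = "".join(read.split()).upper()
    start = seq.find(left)
    if start == -1:
        return None
    start += len(left)                 # insert begins right after the 5' flank
    end = seq.find(right, start)       # first 3' flank after the 5' flank
    if end == -1:
        return None
    return seq[start:end]


if __name__ == "__main__":
    read = "nnACGTACGTACTGAGGTAGTAGGTTGTATAGTTTTGGCCAAGGnn"
    print(extract_insert(read))
```

Returning `None` rather than an empty string separates "no flanks found" from "flanks found with nothing between them", which matters when reads are screened in bulk.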

Document type: 
Article

Evidence of Balanced Diversity at the Chicken Interleukin 4 Receptor Alpha Chain Locus

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2009
Abstract: 

Background: The comparative analysis of genome sequences emerging for several avian species with the fully sequenced chicken genome enables the genome-wide investigation of selective processes in functionally important chicken genes. In particular, because of pathogenic challenges it is expected that genes involved in the chicken immune system are subject to particularly strong adaptive pressure. Signatures of selection detected by inter-species comparison may then be investigated at the population level in global chicken populations to highlight potentially relevant functional polymorphisms.
Results: Comparative evolutionary analysis of chicken (Gallus gallus) and zebra finch (Taeniopygia guttata) genes identified interleukin 4 receptor alpha-chain (IL-4Rα), a key cytokine receptor, as a candidate with a significant excess of substitutions at nonsynonymous sites, suggestive of adaptive evolution. Resequencing and detailed population genetic analysis of this gene in diverse village chickens from Asia and Africa, commercial broilers, and in outgroup species red jungle fowl (JF), grey JF, Ceylon JF, green JF, grey francolin and bamboo partridge, suggested elevated and balanced diversity across all populations at this gene, acting to preserve different high-frequency alleles at two nonsynonymous sites.
Conclusion: Haplotype networks indicate that red JF is the primary contributor of diversity at chicken IL-4Rα: the signature of variation observed here may be due to the effects of domestication, admixture and introgression, which produce high diversity. However, this gene is a key cytokine-binding receptor in the immune system, so balancing selection related to the host response to pathogens cannot be excluded.

Document type: 
Article

Bursts and Horizontal Evolution of DNA Transposons in the Speciation of Pseudotetraploid Salmonids

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2007
Abstract: 

Background: Several genome duplications have occurred in the evolutionary history of teleost fish. In returning to a stable diploid state, the polyploid genome reorganized and large portions were lost, while the fish lines evolved into numerous species. Large scale transposon movement has been postulated to play an important role in the genome reorganization process. We analyzed the DNA sequence of several large loci in Salmo salar and other species for the presence of DNA transposon families.
Results: We have identified bursts of activity of 14 families of DNA transposons (12 Tc1-like and 2 piggyBac-like families, including 11 novel ones) in genome sequences of Salmo salar. Several of these families have similar sequences in a number of closely and distantly related fish, lamprey, and frog species as well as in the parasite Schistosoma japonicum. Analysis of sequence similarities between copies within the families of these bursts demonstrates several waves of transposition activities coinciding with salmonid species divergence. Tc1-like families show a master gene-like copying process, illustrated by extensive but short bursts of copying activity, while the piggyBac-like families show a more random copying pattern. Recent families may include copies with an open reading frame for an active transposase enzyme.
Conclusion: We have identified defined bursts of transposon activity that make use of master-slave and random mechanisms. The bursts occur well after hypothesized polyploidy events and coincide with speciation events. Parasite-mediated lateral transfer of transposons is implicated.

Document type: 
Article

Sequencing the Genome of the Atlantic Salmon (Salmo Salar)

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

The International Collaboration to Sequence the Atlantic Salmon Genome (ICSASG) will produce a genome sequence that identifies and physically maps all genes in the Atlantic salmon genome and acts as a reference sequence for other salmonids.

Document type: 
Article

Distribution of Ancestral Proto-Actinopterygian Chromosome Arms within the Genomes of 4R-Derivative Salmonid Fishes (Rainbow Trout and Atlantic Salmon)

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2008
Abstract: 

Background: Comparative genomic studies suggest that the modern day assemblage of ray-finned fishes have descended from an ancestral grouping of fishes that possessed 12–13 linkage groups. All jawed vertebrates are postulated to have experienced two whole genome duplications (WGD) in their ancestry (2R duplication). Salmonids have experienced one additional WGD (4R duplication event) compared to most extant teleosts, which underwent a further 3R WGD compared to other vertebrates. We describe the organization of the 4R chromosomal segments of the proto-ray-finned fish karyotype in Atlantic salmon and rainbow trout based upon their comparative syntenies with two model species of 3R ray-finned fishes.
Results: Evidence is presented for the retention of large whole-arm affinities between the ancestral linkage groups of the ray-finned fishes, and the 50 homeologous chromosomal segments in Atlantic salmon and rainbow trout. In the comparisons between the two salmonid species, there is also evidence for the retention of large whole-arm homeologous affinities that are associated with the retention of duplicated markers. Five of the 7 pairs of chromosomal arm regions expressing the highest level of duplicate gene expression in rainbow trout share homologous synteny to the 5 pairs of homeologs with the greatest duplicate gene expression in Atlantic salmon. These regions are derived from proto-Actinopterygian linkage groups B, C, E, J and K.
Conclusion: Two chromosome arms in Danio rerio and Oryzias latipes (descendants of the 3R duplication) can, in most instances, be related to at least 4 whole or partial chromosomal arms in the salmonid species. Multiple arm assignments in the two salmonid species do not clearly support a 13 proto-linkage group model, and suggest that a 12 proto-linkage group arrangement (i.e., a separate single chromosome duplication and ancestral fusion/fissions/recombination within the putative G/H/I groupings) may have occurred in the more basal soft-rayed fishes. We also found evidence supporting the model that ancestral linkage group M underwent a single chromosome duplication following the 3R duplication. In the salmonids, the M ancestral linkage groups are localized to 5 whole arm, and 3 partial arm regions (i.e., 6 whole arm regions expected). Thus, 3 distinct ancestral linkage groups are postulated to have existed in the G/H and M lineage chromosomes in the ancestor of the salmonids.

Document type: 
Article

Convergent Evolution of RFX Transcription Factors and Ciliary Genes Predated the Origin of Metazoans

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Background: Intraflagellar transport (IFT) genes, which are critical for the development and function of cilia and flagella in metazoans, are tightly regulated by the Regulatory Factor X (RFX) transcription factors (TFs). However, how and when their evolutionary relationship was established remains unknown.
Results: We have identified evidence suggesting that RFX TFs and IFT genes evolved independently and their evolution converged before the first appearance of metazoans. Both ciliary genes and RFX TFs exist in all metazoans as well as some unicellular eukaryotes. However, while RFX TFs and IFT genes are found simultaneously in all sequenced metazoan genomes, RFX TFs do not co-exist with IFT genes in most pre-metazoans and thus do not regulate them in these organisms. For example, neither the budding yeast nor the fission yeast possesses cilia although both have well-defined RFX TFs. Conversely, most unicellular eukaryotes, including the green alga Chlamydomonas reinhardtii, have typical cilia and well conserved IFT genes but lack RFX TFs. Outside of metazoans, RFX TFs and IFT genes co-exist only in choanoflagellates including M. brevicollis, and in only one fungus, Allomyces macrogynus, of the 51 sequenced fungal genomes. M. brevicollis has two putative RFX genes and a full complement of ciliary genes.
Conclusions: The evolution of RFX TFs and that of IFT genes were independent in pre-metazoans. We propose that their convergence in evolution, or the acquired transcriptional regulation of IFT genes by RFX TFs, played a pivotal role in the establishment of metazoans.

Document type: 
Article

Structural Characterization of Genomes by Large Scale Sequence-Structure Threading: Application of Reliability Analysis in Structural Genomics

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2004
Abstract: 

Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment.
Results: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms.
Conclusions: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
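
For readers unfamiliar with the function named in this abstract, the standard two-parameter Weibull forms from reliability theory are sketched below. The abstract does not state the exact parameterization fitted to the fold data, so the variable x, the scale symbol η and the power-law exponent b are assumed notation; β is the shape parameter the abstract refers to.

```latex
% Standard two-parameter Weibull forms (x >= 0, scale \eta > 0, shape \beta > 0).
% \eta and b are assumed symbols; only \beta appears in the abstract.
\begin{align}
f(x;\eta,\beta) &= \frac{\beta}{\eta}\left(\frac{x}{\eta}\right)^{\beta-1}
                   \exp\!\left[-\left(\frac{x}{\eta}\right)^{\beta}\right]
                   && \text{(probability density)} \\
R(x) &= \exp\!\left[-\left(\frac{x}{\eta}\right)^{\beta}\right]
                   && \text{(reliability, or survival, function)} \\
h(x) &= \frac{f(x)}{R(x)} = \frac{\beta}{\eta}\left(\frac{x}{\eta}\right)^{\beta-1}
                   && \text{(hazard rate)} \\
p(x) &\propto x^{-b}
                   && \text{(power law, for comparison)}
\end{align}
```

In this form the hazard rate decreases with x when β < 1, is constant when β = 1 (the exponential case), and increases when β > 1, which is why β is the parameter usually compared across systems.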

Document type: 
Article

Identification of Ciliary and Ciliopathy Genes in Caenorhabditis Elegans through Comparative Genomics

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2006
Abstract: 

Background: The recent availability of genome sequences of multiple related Caenorhabditis species has made it possible to identify, using comparative genomics, similarly transcribed genes in Caenorhabditis elegans and its sister species. Taking this approach, we have identified numerous novel ciliary genes in C. elegans, some of which may be orthologs of unidentified human ciliopathy genes.
Results: By screening for genes possessing canonical X-box sequences in promoters of three Caenorhabditis species, namely C. elegans, C. briggsae and C. remanei, we identified 93 genes (including known X-box regulated genes) that encode putative components of ciliated neurons in C. elegans and are subject to the same regulatory control. For many of these genes, restricted anatomical expression in ciliated cells was confirmed, and control of transcription by the ciliogenic DAF-19 RFX transcription factor was demonstrated by comparative transcriptional profiling of different tissue types and of daf-19(+) and daf-19(-) animals. Finally, we demonstrate that the dye-filling defect of dyf-5(mn400) animals, which is indicative of compromised exposure of cilia to the environment, is caused by a nonsense mutation in the serine/threonine protein kinase gene M04C9.5.
Conclusion: Our comparative genomics-based predictions may be useful for identifying genes involved in human ciliopathies, including Bardet-Biedl Syndrome (BBS), since the C. elegans orthologs of known human BBS genes contain X-box motifs and are required for normal dye filling in C. elegans ciliated neurons.

Document type: 
Article

Identification and Characterization of Novel Human Tissue-Specific RFX Transcription Factors

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2008
Abstract: 

BACKGROUND: Five regulatory factor X (RFX) transcription factors (TFs), RFX1-5, have been previously characterized in the human genome. They have been demonstrated to be critical for development and are associated with an expanding list of serious human disease conditions including major histocompatibility complex (MHC) class II deficiency and ciliopathies.
RESULTS: In this study, we have identified two additional RFX genes, RFX6 and RFX7, in the current human genome sequences. Both RFX6 and RFX7 are demonstrated to be winged-helix TFs and have well conserved RFX DNA binding domains (DBDs), which are also found in winged-helix TFs RFX1-5. Phylogenetic analysis suggests that the RFX family in the human genome has undergone at least three gene duplications in evolution and the seven human RFX genes can be clearly categorized into three subgroups: (1) RFX1-3, (2) RFX4 and RFX6, and (3) RFX5 and RFX7. Our functional genomics analysis suggests that RFX6 and RFX7 have distinct expression profiles. RFX6 is expressed almost exclusively in the pancreatic islets, while RFX7 has high ubiquitous expression in nearly all tissues examined, particularly in various brain tissues.
CONCLUSION: The identification and further characterization of these two novel RFX genes hold promise for gaining critical insight into development and many disease conditions in mammals, potentially leading to identification of disease genes and biomarkers.

Document type: 
Article