Molecular Biology and Biochemistry, Department of

Mos1-Mediated Transgenesis to Probe Consequences of Single Gene Mutations in Variation-Rich Isolates of Caenorhabditis elegans

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2012-11-14
Abstract: 

Caenorhabditis elegans, especially the N2 isolate, is an invaluable biological model system. Numerous additional natural C. elegans isolates have been shown to have unexpected genotypic and phenotypic variations, which has encouraged researchers to use next-generation sequencing methodology to develop a more complete picture of genotypic variations among the isolates. To understand the phenotypic effects of a genomic variation (GV) on a single gene in a variation-rich genetic background, one should analyze that particular GV in a well-understood genetic background. In C. elegans, the analysis is usually done in N2, which requires extensive crossing to bring in the GV. This can be a very time-consuming procedure; thus, it is important to establish a fast and efficient approach to test the effect of GVs from different isolates in N2. Here we use a Mos1-mediated single-copy insertion (MosSCI) method for phenotypic assessments of GVs from the variation-rich Hawaiian strain CB4856 in N2. Specifically, we investigate the effects of variations identified in the CB4856 strain on tac-1, an essential gene that is necessary for mitotic spindle elongation and pronuclear migration. We show the usefulness of the MosSCI method by using EU1004 tac-1(or402) as a control. or402 is a temperature-sensitive lethal allele within a well-conserved TACC (transforming acidic coiled-coil) domain that results in a leucine-to-phenylalanine change at amino acid 229. CB4856 contains a variation that affects the second exon of tac-1, causing a cysteine-to-tryptophan change at amino acid 94, also within the TACC domain. Using the MosSCI method, we analyze tac-1 from CB4856 in the N2 background and demonstrate that the C94W change, albeit significant, does not cause any obvious decrease in viability. This MosSCI method has proven to be a rapid and efficient way to analyze GVs.

Document type: 
Article

Targeted Assembly of Short Sequence Reads

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole-genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions, and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible, and easy-to-use tool for targeted assembly.
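The read-recruitment step described in this abstract can be illustrated with a minimal sketch. This is not TASR's code: the function names, the word length `k`, and the rule that a read is kept when it shares at least one exact k-length word with the target (on either strand) are all assumptions made for illustration, and the stringent assembly step is not shown.

```python
def kmers(seq, k):
    """Return the set of all length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reverse_complement(seq):
    """Reverse-complement a DNA string; unknown bases become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq))


def recruit_reads(target, reads, k=15):
    """Keep reads sharing at least one exact k-mer with the target.

    Hypothetical helper: each kept read is returned in the orientation
    that matched the target, so later steps see a single strand.
    """
    target_words = kmers(target, k)
    recruited = []
    for read in reads:
        for candidate in (read, reverse_complement(read)):
            if kmers(candidate, k) & target_words:
                recruited.append(candidate)
                break
    return recruited


# Toy data, invented for this example.
target = "ACGTTGCATGCCGATAGGCTAACG"
reads = [
    "CATGCCGATAGGCTAACGTTTT",  # overlaps the target and extends past its end
    "ATCGGCATGCAA",            # matches the target only after reverse complement
    "GGGGGGGGGGGGGGGG",        # unrelated read, discarded
]
hits = recruit_reads(target, reads, k=8)
```

Only the recruited reads would be passed on to assembly, which is why a targeted search can avoid processing the whole data set.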

Document type: 
Article

Quantitative Trait Locus (QTL) Mapping Reveals a Role for Unstudied Genes in Aspergillus Virulence

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Infections caused by the fungus Aspergillus are a major cause of morbidity and mortality in immunocompromised populations. To identify genes required for virulence that could be used as targets for novel treatments, we mapped quantitative trait loci (QTL) affecting virulence in the progeny of a cross between two strains of A. nidulans (FGSC strains A4 and A91). We genotyped 61 progeny at 739 single nucleotide polymorphisms (SNPs) spread throughout the genome, and constructed a linkage map that was largely consistent with the genomic sequence, with the exception of one potential inversion of ~527 kb on Chromosome V. The estimated genome size was 3705 cM and the average intermarker spacing was 5.0 cM. The average ratio of physical distance to genetic distance was 8.1 kb/cM, which is similar to previous estimates, and variation in recombination rate was significantly positively correlated with GC content, a pattern seen in other taxa. To map QTL affecting virulence, we measured the ability of each progeny strain to kill model hosts, larvae of the wax moth Galleria mellonella. We detected three QTL affecting in vivo virulence that were distinct from QTL affecting in vitro growth, and mapped the virulence QTL to regions containing 7–24 genes, excluding genes with no sequence variation between the parental strains and genes with only synonymous SNPs. None of the genes in our QTL target regions have been previously associated with virulence in Aspergillus, and almost half of these genes are currently annotated as “hypothetical”. This study is the first to map QTL affecting the virulence of a fungal pathogen in an animal host, and our results illustrate the power of this approach to identify a short list of unknown genes for further investigation.
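The map statistics quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the three figures given above; the variable names are illustrative, and the physical size it computes is a derived estimate, not a value reported in the abstract.

```python
# Figures taken from the abstract.
markers = 739            # genotyped SNPs
map_length_cm = 3705.0   # estimated genetic map length, in cM
kb_per_cm = 8.1          # average physical-to-genetic distance ratio

# Dividing map length by marker count gives roughly the reported
# 5.0 cM average spacing (this ignores chromosome ends).
mean_spacing_cm = map_length_cm / markers

# Derived estimate (assumption: the ratio applies genome-wide):
# physical size implied by the map length, in Mb.
implied_genome_mb = map_length_cm * kb_per_cm / 1000.0
```

Under these assumptions the spacing comes out at about 5.0 cM and the implied physical size at about 30 Mb.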

Document type: 
Article

Comparison of Antibody Repertoires Produced by HIV-1 Infection, Other Chronic and Acute Infections, and Systemic Autoimmune Disease

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Background

Antibodies (Abs) produced during HIV-1 infection rarely neutralize a broad range of viral isolates; only eight broadly neutralizing (bNt) monoclonal (M)Abs have been isolated. Yet, to be effective, an HIV-1 vaccine may have to elicit the essential features of these MAbs. The V genes of all of these bNt MAbs are highly somatically mutated, and the VH genes of five of them encode a long (≥20 aa) third complementarity-determining region (CDR-H3). This led us to question whether long CDR-H3s and high levels of somatic mutation (SM) are preferred features of anti-HIV bNt MAbs, or whether other adaptive immune responses elicit them in general.

Methodology and Principal Findings

We assembled a VH-gene sequence database from over 700 human MAbs of known antigen specificity isolated from chronic (viral) infections (ChI), acute (bacterial and viral) infections (AcI), and systemic autoimmune diseases (SAD), and compared their CDR-H3 length, number of SMs and germline VH-gene usage. We found that anti-HIV Abs, regardless of their neutralization breadth, tended to have long CDR-H3s and high numbers of SMs. However, these features were also common among Abs associated with other chronic viral infections. In contrast, Abs from acute viral infections (but not bacterial infections) tended to have relatively short CDR-H3s and a low number of SMs, whereas SAD Abs were generally intermediate in CDR-H3 length and number of SMs. Analysis of VH gene usage showed that ChI Abs also tended to favor distal germline VH-genes (particularly VH1-69), especially in Abs bearing long CDR-H3s.

Conclusions and Significance

The striking difference between the Abs produced during chronic vs. acute viral infection suggests that Abs bearing long CDR-H3s, high levels of SM and VH1-69 gene usage may be preferentially selected during persistent infection.

Document type: 
Article

Localization of a Guanylyl Cyclase to Chemosensory Cilia Requires the Novel Ciliary MYND Domain Protein DAF-25

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

In harsh conditions, Caenorhabditis elegans arrests development to enter a non-aging, resistant diapause state called the dauer larva. Olfactory sensation modulates the TGF-β and insulin signaling pathways to control this developmental decision. Four mutant alleles of daf-25 (abnormal DAuer Formation) were isolated from screens for mutants exhibiting constitutive dauer formation and found to be defective in olfaction. The daf-25 dauer phenotype is suppressed by daf-10/IFT122 mutations (which disrupt ciliogenesis), but not by daf-6/PTCHD3 mutations (which prevent environmental exposure of sensory cilia), implying that DAF-25 functions in the cilia themselves. daf-25 encodes the C. elegans ortholog of mammalian Ankmy2, a MYND domain protein of unknown function. Disruption of DAF-25, which localizes to sensory cilia, produces no apparent cilia structure anomalies, as determined by light and electron microscopy. Hinting at its potential function, the dauer phenotype, epistatic order, and expression profile of daf-25 are similar to daf-11, which encodes a cilium-localized guanylyl cyclase. Indeed, we demonstrate that DAF-25 is required for proper DAF-11 ciliary localization. Furthermore, the functional interaction is evolutionarily conserved, as mouse Ankmy2 interacts with guanylyl cyclase GC1 from ciliary photoreceptors. The interaction may be specific because daf-25 mutants have normally-localized OSM-9/TRPV4, TAX-4/CNGA1, CHE-2/IFT80, CHE-11/IFT140, CHE-13/IFT57, BBS-8, OSM-5/IFT88, and XBX-1/D2LIC in the cilia. Intraflagellar transport (IFT) (required to build cilia) is not defective in daf-25 mutants, although the ciliary localization of DAF-25 itself is influenced in che-11 mutants, which are defective in retrograde IFT. In summary, we have discovered a novel ciliary protein that plays an important role in cGMP signaling by localizing a guanylyl cyclase to the sensory organelle.

Document type: 
Article

Module Discovery by Exhaustive Search for Densely Connected, Co-Expressed Regions in Biomolecular Interaction Networks

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Background

Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well-studied large-scale data sources, available for many organisms that are not yet exhaustively annotated. It has been well established that, when these two data sources are analyzed jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear, and methods by which to exhaustively search for such constellations had not been presented.

Methodology/Principal Findings

We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automated filtering procedure reduces our output, which results in smaller collections of modules comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition that adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search by using the unreduced output to more quickly perform GO term-related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples.

Conclusion/Significance

We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well-settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and widely available large-scale datasets.

Document type: 
Article

Genome-Wide Comparative Gene Family Classification

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done to the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter settings, i.e., different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species.

Document type: 
Article

Identification of the Regulatory Logic Controlling Salmonella Pathoadaptation by the SsrA-SsrB Two-Component System

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Sequence data from the past decade has laid bare the significance of horizontal gene transfer in creating genetic diversity in the bacterial world. Regulatory evolution, in which non-coding DNA is mutated to create new regulatory nodes, also contributes to this diversity to allow niche adaptation and the evolution of pathogenesis. To survive in the host environment, Salmonella enterica uses a type III secretion system and effector proteins, which are activated by the SsrA-SsrB two-component system in response to the host environment. To better understand the phenomenon of regulatory evolution in S. enterica, we defined the SsrB regulon and asked how this transcription factor interacts with the cis-regulatory region of target genes. Using ChIP-on-chip, cDNA hybridization, and comparative genomics analyses, we describe the SsrB-dependent regulon of ancestral and horizontally acquired genes. Further, we used a genetic screen and computational analyses integrating experimental data from S. enterica and sequence data from an orthologous regulatory system in the insect endosymbiont, Sodalis glossinidius, to identify the conserved yet flexible palindrome sequence that defines DNA recognition by SsrB. Mutational analysis of a representative promoter validated this palindrome as the minimal architecture needed for regulatory input by SsrB. These data provide a high-resolution map of a regulatory network and the underlying logic enabling pathogen adaptation to a host.

Document type: 
Article

The Imprinted Retrotransposon-Like Gene PEG11 (RTL1) Is Expressed as a Full-Length Protein in Skeletal Muscle from Callipyge Sheep

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Members of the Ty3-Gypsy retrotransposon family are rare in mammalian genomes despite their abundance in invertebrates and some vertebrates. These elements contain a gag-pol-like structure characteristic of retroviruses but have lost their ability to retrotranspose into the mammalian genome and are thought to be inactive relics of ancient retrotransposition events. One of these retrotransposon-like elements, PEG11 (also called RTL1), is located at the distal end of ovine chromosome 18 within an imprinted gene cluster that is highly conserved in placental mammals. The region contains several conserved imprinted genes including BEGAIN, DLK1, DAT, GTL2 (MEG3), PEG11 (RTL1), PEG11as, MEG8, MIRG and DIO3. An intergenic point mutation between DLK1 and GTL2 causes muscle hypertrophy in callipyge sheep and is associated with large changes in expression of the genes linked in cis between DLK1 and MEG8. It has been suggested that over-expression of DLK1 is the effector of the callipyge phenotype; however, PEG11 gene expression is also strongly correlated with the emergence of the muscling phenotype as a function of genotype, muscle type and developmental stage. To date, there has been no direct evidence that PEG11 encodes a protein, especially as its antisense transcript (PEG11as) contains six miRNAs that cause cleavage of the PEG11 transcript. Using immunological and mass spectrometry approaches, we have directly identified the full-length PEG11 protein from postnatal nuclear preparations of callipyge skeletal muscle and conclude that its over-expression may be involved in inducing muscle hypertrophy. The developmental expression pattern of the PEG11 gene is consistent with the callipyge mutation causing recapitulation of the normal fetal-like gene expression program during postnatal development. Analysis of the PEG11 sequence indicates strong conservation of the regions encoding the antisense microRNAs, and in at least two cases these correspond with structural or functional domains of the protein, suggesting co-evolution of the sense and antisense genes.

Document type: 
Article

The Association of Virulence Factors with Genomic Islands

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2009
Abstract: 

Background

It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced short palindromic repeats (CRISPRs), which have been proposed to confer resistance to phage.

Methodology/Principal Findings

We show quantitatively that GIs disproportionately contain more virulence factors than the rest of a given genome (p<1E-40 using three GI datasets) and that CRISPRs are also over-represented in GIs. Virulence factors in GIs and pathogen-associated virulence factors are enriched for proteins having more “offensive” functions, e.g. active invasion of the host, and are disproportionately components of type III/IV secretion systems or toxins. Numerous hypothetical pathogen-associated genes were identified, meriting further study.

Conclusions/Significance

This is the first systematic analysis across diverse genera indicating that virulence factors are disproportionately associated with GIs. “Offensive” virulence factors, as opposed to host-interaction factors, may more often be a recently acquired trait (on an evolutionary time scale detected by GI analysis). Newly identified pathogen-associated genes warrant further study. We discuss the implications of these results, which cement the significant role of GIs in the evolution of many pathogens.

Document type: 
Article