Molecular Biology and Biochemistry, Department of


Genome Comparison of Human and Non-Human Malaria Parasites Reveals Species Subset-Specific Genes Potentially Linked to Human Disease

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Genes underlying important phenotypic differences between Plasmodium species, the causative agents of malaria, are frequently found in only a subset of species and cluster at dynamically evolving subtelomeric regions of chromosomes. We hypothesized that chromosome-internal regions of Plasmodium genomes harbour additional species subset-specific genes that underlie differences in human pathogenicity, human-to-human transmissibility, and human virulence. We combined sequence similarity searches with synteny block analyses to identify species subset-specific genes in chromosome-internal regions of six published Plasmodium genomes, including Plasmodium falciparum, Plasmodium vivax, Plasmodium knowlesi, Plasmodium yoelii, Plasmodium berghei, and Plasmodium chabaudi. To improve comparative analysis, we first revised incorrectly annotated gene models using homology-based gene finders and examined putative subset-specific genes within syntenic contexts. Confirmed subset-specific genes were then analyzed for their role in biological pathways and examined for molecular functions using publicly available databases. We identified 16 genes that are well conserved in the three primate parasites but not found in rodent parasites, including three key enzymes of the thiamine (vitamin B1) biosynthesis pathway. Thirteen genes were found to be present in both human parasites but absent in the monkey parasite P. knowlesi, including genes specifically upregulated in sporozoites or gametocytes that could be linked to parasite transmission success between humans. Furthermore, we propose 15 chromosome-internal P. falciparum-specific genes as new candidate genes underlying increased human virulence and detected a currently uncharacterized cluster of P. vivax-specific genes on chromosome 6 likely involved in erythrocyte invasion. 
In conclusion, Plasmodium species harbour many chromosome-internal differences in the form of protein-coding genes, some of which are potentially linked to human disease and thus promising leads for future laboratory research.

Document type: 
Article

Mutations in a Guanylate Cyclase GCY-35/GCY-36 Modify Bardet-Biedl Syndrome–Associated Phenotypes in Caenorhabditis elegans

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Ciliopathies are pleiotropic and genetically heterogeneous disorders caused by defective development and function of the primary cilium. Bardet-Biedl syndrome (BBS) proteins localize to the base of cilia and undergo intraflagellar transport, and the loss of their functions leads to a multisystemic ciliopathy. Here we report the identification of mutations in guanylate cyclases (GCYs) as modifiers of Caenorhabditis elegans bbs endophenotypes. The loss of GCY-35 or GCY-36 results in suppression of the small body size, developmental delay, and exploration defects exhibited by multiple bbs mutants. Moreover, an effector of cGMP signalling, a cGMP-dependent protein kinase, EGL-4, also modifies bbs mutant defects. We propose that a misregulation of cGMP signalling, which underlies developmental and some behavioural defects of C. elegans bbs mutants, may also contribute to some BBS features in other organisms.

Document type: 
Article

Mos1-Mediated Transgenesis to Probe Consequences of Single Gene Mutations in Variation-Rich Isolates of Caenorhabditis elegans

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2012-11-14
Abstract: 

Caenorhabditis elegans, especially the N2 isolate, is an invaluable biological model system. Numerous additional natural C. elegans isolates have been shown to have unexpected genotypic and phenotypic variations, which has encouraged researchers to use next-generation sequencing methodology to develop a more complete picture of genotypic variations among the isolates. To understand the phenotypic effects of a genomic variation (GV) on a single gene in a variation-rich genetic background, one should analyze that particular GV in a well-understood genetic background. In C. elegans, the analysis is usually done in N2, which requires extensive crossing to bring in the GV. This can be a very time-consuming procedure; thus, it is important to establish a fast and efficient approach to test the effect of GVs from different isolates in N2. Here we use a Mos1-mediated single-copy insertion (MosSCI) method for phenotypic assessments of GVs from the variation-rich Hawaiian strain CB4856 in N2. Specifically, we investigate effects of variations identified in the CB4856 strain on tac-1, an essential gene that is necessary for mitotic spindle elongation and pronuclear migration. We show the usefulness of the MosSCI method by using EU1004 tac-1(or402) as a control. or402 is a temperature-sensitive lethal allele within a well-conserved TACC (transforming acidic coiled-coil) domain that results in a leucine to phenylalanine change at amino acid 229. CB4856 contains a variation that affects the second exon of tac-1, causing a cysteine to tryptophan change at amino acid 94, also within the TACC domain. Using the MosSCI method, we analyze tac-1 from CB4856 in the N2 background and demonstrate that the C94W change, albeit significant, does not cause any obvious decrease in viability. This MosSCI method has proven to be a rapid and efficient way to analyze GVs.

Document type: 
Article

Targeted Assembly of Short Sequence Reads

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy to use tool for targeted assembly.
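
The read-recruitment step described in the abstract above can be illustrated with a minimal sketch. This is not TASR's implementation: the function names, the word length `k`, and the rule that a read is kept when it shares at least one exact word with the target on either strand are all assumptions made for illustration, and the stringent assembly step is not shown.

```python
def kmers(seq, k):
    """Return the set of all length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reverse_complement(seq):
    """Reverse-complement a DNA string; characters other than ACGT are left as-is."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def recruit_reads(target, reads, k=15):
    """Keep reads that share at least one exact k-mer with the target.

    Hypothetical recruitment rule: both strands of the target are indexed,
    so reads sequenced from either strand can be recruited.
    """
    words = kmers(target, k) | kmers(reverse_complement(target), k)
    return [read for read in reads if kmers(read, k) & words]


# Toy data (invented for illustration only).
target = "ACGTACGTTAGCCGATAGGCTA"
reads = [
    "CGTTAGCCGATAGGCTAAAC",                 # overlaps the target and extends past it
    "TTTTTTTTTTTTTTTTTTTT",                 # unrelated read
    reverse_complement("TACGTTAGCCGATAG"),  # matches the opposite strand
]
hits = recruit_reads(target, reads, k=12)
```

Restricting the search to words drawn from the target is what keeps the work proportional to the target rather than to the whole genome: each read is tested against a small set of words instead of being aligned or assembled globally.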

Document type: 
Article

Quantitative Trait Locus (QTL) Mapping Reveals a Role for Unstudied Genes in Aspergillus Virulence

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Infections caused by the fungus Aspergillus are a major cause of morbidity and mortality in immunocompromised populations. To identify genes required for virulence that could be used as targets for novel treatments, we mapped quantitative trait loci (QTL) affecting virulence in the progeny of a cross between two strains of A. nidulans (FGSC strains A4 and A91). We genotyped 61 progeny at 739 single nucleotide polymorphisms (SNP) spread throughout the genome, and constructed a linkage map that was largely consistent with the genomic sequence, with the exception of one potential inversion of ~527 kb on Chromosome V. The estimated genome size was 3705 cM and the average intermarker spacing was 5.0 cM. The average ratio of physical distance to genetic distance was 8.1 kb/cM, which is similar to previous estimates, and variation in recombination rate was significantly positively correlated with GC content, a pattern seen in other taxa. To map QTL affecting virulence, we measured the ability of each progeny strain to kill model hosts, larvae of the wax moth Galleria mellonella. We detected three QTL affecting in vivo virulence that were distinct from QTL affecting in vitro growth, and mapped the virulence QTL to regions containing 7–24 genes, excluding genes with no sequence variation between the parental strains and genes with only synonymous SNPs. None of the genes in our QTL target regions have been previously associated with virulence in Aspergillus, and almost half of these genes are currently annotated as “hypothetical”. This study is the first to map QTL affecting the virulence of a fungal pathogen in an animal host, and our results illustrate the power of this approach to identify a short list of unknown genes for further investigation.
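
As a quick consistency check on the map statistics quoted in the abstract above, the following sketch recomputes the average marker spacing and the physical genome size implied by the reported figures. Dividing the map length by the number of markers (rather than by the number of intervals per linkage group) is a simplifying assumption; the abstract does not state how the authors computed the spacing.

```python
# Figures taken from the abstract.
map_length_cm = 3705.0   # total genetic map length, in cM
n_markers = 739          # genotyped SNPs
kb_per_cm = 8.1          # average physical-to-genetic distance ratio

# Simplified spacing estimate (assumption: map length / number of markers).
avg_spacing_cm = map_length_cm / n_markers

# Physical size implied by the map length and the kb/cM ratio.
implied_genome_mb = map_length_cm * kb_per_cm / 1000.0

print(round(avg_spacing_cm, 1))     # 5.0
print(round(implied_genome_mb, 1))  # 30.0
```

Both values agree with the abstract: a spacing of about 5.0 cM, and a physical size of roughly 30 Mb implied by 3705 cM at 8.1 kb/cM.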

Document type: 
Article

Comparison of Antibody Repertoires Produced by HIV-1 Infection, Other Chronic and Acute Infections, and Systemic Autoimmune Disease

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Background

Antibodies (Abs) produced during HIV-1 infection rarely neutralize a broad range of viral isolates; only eight broadly-neutralizing (bNt) monoclonal (M)Abs have been isolated. Yet, to be effective, an HIV-1 vaccine may have to elicit the essential features of these MAbs. The V genes of all of these bNt MAbs are highly somatically mutated, and the VH genes of five of them encode a long (≥20 aa) third complementarity-determining region (CDR-H3). This led us to question whether long CDR-H3s and high levels of somatic mutation (SM) are a preferred feature of anti-HIV bNt MAbs, or if other adaptive immune responses elicit them in general.

Methodology and Principal Findings

We assembled a VH-gene sequence database from over 700 human MAbs of known antigen specificity isolated from chronic (viral) infections (ChI), acute (bacterial and viral) infections (AcI), and systemic autoimmune diseases (SAD), and compared their CDR-H3 length, number of SMs and germline VH-gene usage. We found that anti-HIV Abs, regardless of their neutralization breadth, tended to have long CDR-H3s and high numbers of SMs. However, these features were also common among Abs associated with other chronic viral infections. In contrast, Abs from acute viral infections (but not bacterial infections) tended to have relatively short CDR-H3s and a low number of SMs, whereas SAD Abs were generally intermediate in CDR-H3 length and number of SMs. Analysis of VH gene usage showed that ChI Abs also tended to favor distal germline VH-genes (particularly VH1-69), especially in Abs bearing long CDR-H3s.

Conclusions and Significance

The striking difference between the Abs produced during chronic vs. acute viral infection suggests that Abs bearing long CDR-H3s, high levels of SM and VH1-69 gene usage may be preferentially selected during persistent infection.

Document type: 
Article

Localization of a Guanylyl Cyclase to Chemosensory Cilia Requires the Novel Ciliary MYND Domain Protein DAF-25

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

In harsh conditions, Caenorhabditis elegans arrests development to enter a non-aging, resistant diapause state called the dauer larva. Olfactory sensation modulates the TGF-β and insulin signaling pathways to control this developmental decision. Four mutant alleles of daf-25 (abnormal DAuer Formation) were isolated from screens for mutants exhibiting constitutive dauer formation and found to be defective in olfaction. The daf-25 dauer phenotype is suppressed by daf-10/IFT122 mutations (which disrupt ciliogenesis), but not by daf-6/PTCHD3 mutations (which prevent environmental exposure of sensory cilia), implying that DAF-25 functions in the cilia themselves. daf-25 encodes the C. elegans ortholog of mammalian Ankmy2, a MYND domain protein of unknown function. Disruption of DAF-25, which localizes to sensory cilia, produces no apparent cilia structure anomalies, as determined by light and electron microscopy. Hinting at its potential function, the dauer phenotype, epistatic order, and expression profile of daf-25 are similar to daf-11, which encodes a cilium-localized guanylyl cyclase. Indeed, we demonstrate that DAF-25 is required for proper DAF-11 ciliary localization. Furthermore, the functional interaction is evolutionarily conserved, as mouse Ankmy2 interacts with guanylyl cyclase GC1 from ciliary photoreceptors. The interaction may be specific because daf-25 mutants have normally-localized OSM-9/TRPV4, TAX-4/CNGA1, CHE-2/IFT80, CHE-11/IFT140, CHE-13/IFT57, BBS-8, OSM-5/IFT88, and XBX-1/D2LIC in the cilia. Intraflagellar transport (IFT) (required to build cilia) is not defective in daf-25 mutants, although the ciliary localization of DAF-25 itself is influenced in che-11 mutants, which are defective in retrograde IFT. In summary, we have discovered a novel ciliary protein that plays an important role in cGMP signaling by localizing a guanylyl cyclase to the sensory organelle.

Document type: 
Article

Module Discovery by Exhaustive Search for Densely Connected, Co-Expressed Regions in Biomolecular Interaction Networks

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Background

Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well-studied large-scale data sources, available for many organisms that are not yet exhaustively annotated. It has been well established that, when these two data sources are analyzed jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear, and methods by which to exhaustively search for such constellations had not been presented.

Methodology/Principal Findings

We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automated filtering procedure reduces our output, yielding smaller collections of modules comparable to those of state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition that adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples.

Conclusion/Significance

We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets.

Document type: 
Article

Genome-Wide Comparative Gene Family Classification

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species.
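
The comparative strategy described in the abstract above, choosing clustering parameters by how well they reproduce curated families in a reference species, can be sketched as a small parameter sweep. Everything here is an illustrative assumption: the single-linkage `cluster` function is only a stand-in for a real clusterer such as TRIBE-MCL, the similarity `threshold` stands in for that program's parameters, and the pairwise F1 score is one possible agreement measure, not necessarily the one the authors used.

```python
from itertools import combinations


def cluster(genes, similarity, threshold):
    """Stand-in clusterer: single-linkage grouping of pairs scoring >= threshold."""
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the root, compressing the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), set()).add(g)
    return list(groups.values())


def pair_set(families):
    """All unordered gene pairs that fall in the same family."""
    return {frozenset(p) for fam in families for p in combinations(sorted(fam), 2)}


def pairwise_f1(predicted, curated):
    """Agreement between predicted and curated families, scored on gene pairs."""
    pred, true = pair_set(predicted), pair_set(curated)
    shared = len(pred & true)
    if shared == 0:
        return 0.0
    precision, recall = shared / len(pred), shared / len(true)
    return 2 * precision * recall / (precision + recall)


def best_parameter(genes, similarity, curated, candidates):
    """Return the candidate parameter that best reproduces the curated families."""
    return max(candidates,
               key=lambda t: pairwise_f1(cluster(genes, similarity, t), curated))


# Toy reference species (invented data): two curated families.
genes = ["a1", "a2", "a3", "b1", "b2"]
similarity = {("a1", "a2"): 0.9, ("a2", "a3"): 0.8,
              ("b1", "b2"): 0.85, ("a3", "b1"): 0.4}
curated = [{"a1", "a2", "a3"}, {"b1", "b2"}]
chosen = best_parameter(genes, similarity, curated, [0.3, 0.6, 0.95])
```

In this toy case the lowest threshold merges both families and the highest splits every gene apart, so the middle value is chosen. Under the strategy in the abstract, the parameter selected on the reference species would then be applied to the related, uncurated genomes.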

Document type: 
Article

Identification of the Regulatory Logic Controlling Salmonella Pathoadaptation by the SsrA-SsrB Two-Component System

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Sequence data from the past decade has laid bare the significance of horizontal gene transfer in creating genetic diversity in the bacterial world. Regulatory evolution, in which non-coding DNA is mutated to create new regulatory nodes, also contributes to this diversity to allow niche adaptation and the evolution of pathogenesis. To survive in the host environment, Salmonella enterica uses a type III secretion system and effector proteins, which are activated by the SsrA-SsrB two-component system in response to the host environment. To better understand the phenomenon of regulatory evolution in S. enterica, we defined the SsrB regulon and asked how this transcription factor interacts with the cis-regulatory region of target genes. Using ChIP-on-chip, cDNA hybridization, and comparative genomics analyses, we describe the SsrB-dependent regulon of ancestral and horizontally acquired genes. Further, we used a genetic screen and computational analyses integrating experimental data from S. enterica and sequence data from an orthologous regulatory system in the insect endosymbiont, Sodalis glossinidius, to identify the conserved yet flexible palindrome sequence that defines DNA recognition by SsrB. Mutational analysis of a representative promoter validated this palindrome as the minimal architecture needed for regulatory input by SsrB. These data provide a high-resolution map of a regulatory network and the underlying logic enabling pathogen adaptation to a host.

Document type: 
Article