Computing Science, School of

Genome-wide variations in a natural isolate of the nematode Caenorhabditis elegans

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2014
Abstract: 

Background

Increasing genetic and phenotypic differences found among natural isolates of C. elegans have encouraged researchers to explore the natural variation of this nematode species.

Results

Here we report on the identification of genomic differences between the reference strain N2 and the Hawaiian strain CB4856, one of the most genetically distant strains from N2. To identify both small- and large-scale genomic variations (GVs), we have sequenced the CB4856 genome using both Roche 454 (~400 bps single reads) and Illumina GA DNA sequencing methods (101 bps paired-end reads). Compared to previously described variants (available in WormBase), our effort uncovered twice as many single nucleotide variants (SNVs) and increased the number of small InDels almost 20-fold. Moreover, we identified and validated large insertions, most of which range from 150 bps to 1.2 kb in length in the CB4856 strain. Identified GVs had a widespread impact on protein-coding sequences, including 585 single-copy genes that have associated severe phenotypes of reduced viability in RNAi and genetics studies. Sixty of these genes are homologs of human genes associated with diseases. Furthermore, our work confirms previously identified GVs associated with differences in behavioural and biological traits between the N2 and CB4856 strains.

Conclusions

The identified GVs provide a rich resource for future studies that aim to explain the genetic basis for other trait differences between the N2 and CB4856 strains.

Document type: 
Article

Ribofsm: Frequent Subgraph Mining For the Discovery of RNA Structures and Interactions

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2014
Abstract: 

Frequent subgraph mining is a useful method for extracting meaningful patterns from a set of graphs or a single large graph. Here, the graph represents all possible RNA structures and interactions. Patterns that are significantly more frequent in this graph than in a random graph are extracted. We hypothesize that these patterns are most likely to represent biological mechanisms. The graph representation used is a directed dual graph, extended to handle intermolecular interactions. The graph is sampled for subgraphs, which are labeled using a canonical labeling method and counted. The resulting patterns are compared to those created from a randomized dataset and scored. The algorithm was applied to the mitochondrial genome of the kinetoplastid species Trypanosoma brucei, which has a unique RNA editing mechanism. The most significant patterns contain two stem-loops, indicative of gRNA, and represent interactions of these structures with target mRNA.
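
The labeling-and-counting step can be illustrated with a minimal Python sketch. It is not the Ribofsm implementation: it enumerates every connected three-node subgraph instead of sampling, uses a brute-force canonical label (the smallest relabeled edge list over all node orderings), and omits the comparison against a randomized dataset. All function names and the toy edge list are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, permutations


def is_weakly_connected(nodes, edges):
    """Return True if the nodes form one component when edge direction is ignored."""
    nodes = set(nodes)
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return seen == nodes


def canonical_label(nodes, edges):
    """Brute-force canonical form: the smallest sorted edge tuple over all relabelings."""
    nodes = list(nodes)
    candidates = []
    for order in permutations(range(len(nodes))):
        mapping = dict(zip(nodes, order))
        candidates.append(tuple(sorted((mapping[u], mapping[v]) for u, v in edges)))
    return min(candidates)


def count_patterns(edges, size=3):
    """Count canonical labels of all connected induced subgraphs with `size` nodes."""
    nodes = sorted({n for edge in edges for n in edge})
    counts = Counter()
    for subset in combinations(nodes, size):
        members = set(subset)
        induced = [(u, v) for u, v in edges if u in members and v in members]
        if is_weakly_connected(members, induced):
            counts[canonical_label(subset, induced)] += 1
    return counts


# Hypothetical toy graph: two directed chains that share the same shape.
toy_edges = [("a", "b"), ("b", "c"), ("x", "y"), ("y", "z")]
pattern_counts = count_patterns(toy_edges)
```

Because both chains reduce to the same canonical label, they are counted as two occurrences of one pattern. Brute-force relabeling is only practical for very small subgraphs.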

Document type: 
Article

Creating Groups with Similar Expected Behavioural Response in Randomized Controlled Trials: A Fuzzy Cognitive Map Approach

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2014
Abstract: 

Background

Controlling bias is key to successful randomized controlled trials for behaviour change. Bias can be generated at multiple points during a study, for example, when participants are allocated to different groups. Several allocation methods exist to randomly distribute participants over the groups such that their prognostic factors (e.g., socio-demographic variables) are similar, in an effort to keep participants’ outcomes comparable at baseline. Since it is challenging to create such groups when all prognostic factors are taken together, these factors are often balanced in isolation or only the ones deemed most relevant are balanced. However, the complex interactions among prognostic factors may lead to a poor estimate of behaviour, causing unbalanced groups at baseline, which may introduce accidental bias.

Methods

We present a novel computational approach for allocating participants to different groups. Our approach automatically uses participants’ experiences to model (the interactions among) their prognostic factors and infer how their behaviour is expected to change under a given intervention. Participants are then allocated based on their inferred behaviour rather than on selected prognostic factors.

Results

In order to assess the potential of our approach, we collected two datasets regarding the behaviour of participants (n = 430 and n = 187). The potential of the approach on larger sample sizes was examined using synthetic data. All three datasets highlighted that our approach could lead to groups with similar expected behavioural changes.

Conclusions

The computational approach proposed here can complement existing statistical approaches when behaviours involve numerous complex relationships, and quantitative data is not readily available to model these relationships. The software implementing our approach and commonly used alternatives is provided at no charge to assist practitioners in the design of their own studies and to compare participants' allocations.

Document type: 
Article

Whole Genome Sequencing of Turkish Genomes Reveals Functional Private Alleles and Impact of Genetic Interactions with Europe, Asia and Africa

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2014
Abstract: 

Background

Turkey is a crossroads of major population movements throughout history and has been a hotspot of cultural interactions. Several studies have investigated the complex population history of Turkey through a limited set of genetic markers. However, to date, there have been no studies to assess the genetic variation at the whole genome level using whole genome sequencing. Here, we present whole genome sequences of 16 Turkish individuals resequenced at high coverage (32× to 48×).

Results

We show that the genetic variation of the contemporary Turkish population clusters with South European populations, as expected, but also shows signatures of relatively recent contribution from ancestral East Asian populations. In addition, we document a significant enrichment of non-synonymous private alleles, consistent with recent observations in European populations. A number of variants associated with skin color and total cholesterol levels show frequency differentiation between the Turkish populations and European populations. Furthermore, we have analyzed the 17q21.31 inversion polymorphism region (MAPT locus) and found an increased allele frequency of 31.25% for the H1/H2 inversion polymorphism, compared with an allele frequency of about 25% in European populations.
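
As a reading aid, the reported 31.25% can be converted into an allele count. The short Python sketch below assumes that all 16 sequenced individuals are diploid and were genotyped at this locus, which the abstract does not state; the function name is hypothetical.

```python
def allele_copies(frequency, individuals, ploidy=2):
    """Number of allele copies implied by a frequency in a sample of individuals."""
    chromosomes = individuals * ploidy  # 16 diploid genomes give 32 chromosomes
    return round(frequency * chromosomes)


# 31.25% of 32 chromosomes corresponds to 10 copies under the stated assumption.
copies = allele_copies(0.3125, 16)
```

With a sample this small, a single additional copy shifts the estimate by about three percentage points (1/32), so the difference from the European figure of about 25% (8 of 32 copies) amounts to two copies.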

Conclusion

This study provides the first map of common genetic variation from 16 western Asian individuals and thus helps fill an important geographical gap in analyzing natural human variation and human migration. Our data will help develop population-specific experimental designs for studies investigating disease associations and demographic history in Turkey.

Document type: 
Article

Approaches to Non-Intrusive Load Monitoring (NILM) in the Home

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2012-10-24
Abstract: 

When designing and implementing an intelligent energy conservation system for the home, it is essential to have insight into the activities and actions of the occupants. In particular, it is important to understand what appliances are being used and when. In the computational sustainability research community this is known as load disaggregation or Non-Intrusive Load Monitoring (NILM). NILM is a foundational algorithm that can disaggregate a home’s power usage into the individual appliances that are running and help identify energy conservation opportunities. This depth report will focus on NILM algorithms, their use and evaluation. We will examine and evaluate the anatomy of NILM, looking at techniques using load monitoring, event detection, feature extraction, classification, and accuracy measurement.
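
To make the event detection and classification stages concrete, here is a minimal Python sketch. It is not taken from the report: it assumes a simple step-change detector on the aggregate power signal and a nearest-signature classifier, and the appliance wattages, threshold, and names are hypothetical.

```python
# Hypothetical appliance signatures: typical power step in watts when switched on.
SIGNATURES = {"kettle": 1500.0, "fridge": 120.0, "lamp": 60.0}


def detect_events(power_watts, threshold=50.0):
    """Return (sample index, power change) for every step at least `threshold` watts."""
    events = []
    for i in range(1, len(power_watts)):
        delta = power_watts[i] - power_watts[i - 1]
        if abs(delta) >= threshold:
            events.append((i, delta))
    return events


def classify(delta, signatures=SIGNATURES):
    """Label a step with the appliance whose signature is closest in magnitude."""
    name = min(signatures, key=lambda k: abs(signatures[k] - abs(delta)))
    return name, "on" if delta > 0 else "off"


# Aggregate reading for a home with a 100 W base load and one appliance cycle.
aggregate = [100.0, 100.0, 1600.0, 1600.0, 1600.0, 100.0, 100.0]
labelled = [(i, classify(d)) for i, d in detect_events(aggregate)]
```

Real NILM systems generally use richer features than a single power step, so this sketch only shows how the stages connect.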

Document type: 
Technical Report

Changing Risk Behaviours and the HIV Epidemic: A Mathematical Analysis in the Context of Treatment as Prevention

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Background

Expanding access to highly active antiretroviral therapy (HAART) has become an important approach to HIV prevention in recent years. Previous studies suggest that concomitant changes in risk behaviours may either help or hinder programs that use a Treatment as Prevention strategy.

Analysis

We consider HIV-related risk behaviour as a social contagion in a deterministic compartmental model, which treats risk behaviour and HIV infection as linked processes, where acquiring risk behaviour is a prerequisite for contracting HIV. The equilibrium behaviour of the model is analysed to determine epidemic outcomes under conditions of expanding HAART coverage along with risk behaviours that change with HAART coverage. We determined the potential impact of changes in risk behaviour on the outcomes of Treatment as Prevention strategies. Model results show that HIV incidence and prevalence decline only above threshold levels of HAART coverage, which depend strongly on risk behaviour parameter values. Expanding HAART coverage and simultaneously reducing risk behaviour act synergistically to accelerate the drop in HIV incidence and prevalence. Above the thresholds, additional HAART coverage is always sufficient to reverse the impact of HAART optimism on incidence and prevalence. Applying the model to an HIV epidemic in Vancouver, Canada, showed no evidence of HAART optimism in that setting.
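
The abstract does not give the model's equations. The Python sketch below is therefore a toy system, not the authors' model: the four compartments, the parameter names and values, and the use of a treatment rate `tau` as a stand-in for HAART coverage are all assumptions. It only illustrates how risk behaviour can be written as a contagion that must be acquired before infection is possible, and it leaves out HAART optimism entirely.

```python
def simulate(tau, beta=0.6, alpha=0.5, rho=0.1, dt=0.01, steps=2000):
    """Euler integration of a toy model; returns final population fractions.

    low      -- no risk behaviour, cannot be infected
    risk     -- risk behaviour, susceptible to HIV
    infected -- infected and untreated (infectious)
    treated  -- infected and on treatment (assumed non-infectious)
    """
    low, risk, infected, treated = 0.70, 0.29, 0.01, 0.0
    for _ in range(steps):
        with_risk = risk + infected + treated
        adopt = alpha * low * with_risk  # risk behaviour spreads by social contact
        revert = rho * risk              # some people abandon risk behaviour
        infect = beta * risk * infected  # only the risk group can be infected
        treat = tau * infected           # treatment removes infectiousness
        low += dt * (revert - adopt)
        risk += dt * (adopt - revert - infect)
        infected += dt * (infect - treat)
        treated += dt * treat
    return low, risk, infected, treated


no_treatment = simulate(tau=0.0)
high_treatment = simulate(tau=1.0)
```

In this toy system the untreated infected fraction can only grow when `beta * risk` exceeds `tau`, which loosely echoes the threshold behaviour described above without reproducing the paper's thresholds.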

Conclusions

Our results suggest that Treatment as Prevention has significant potential for controlling the HIV epidemic once HAART coverage reaches a threshold. Furthermore, expanding HAART coverage combined with interventions targeting risk behaviours amplifies the preventive impact, potentially driving the HIV epidemic to elimination.

Document type: 
Article

Video Game Telemetry as a Critical Tool in the Study of Complex Skill Learning

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Cognitive science has long shown interest in expertise, in part because prediction and control of expert development would have immense practical value. Most studies in this area investigate expertise by comparing experts with novices. The reliance on contrastive samples in studies of human expertise only yields deep insight into development where differences are important throughout skill acquisition. This reliance may be pernicious where the predictive importance of variables is not constant across levels of expertise. Before the development of sophisticated machine learning tools for data mining larger samples, and indeed, before such samples were available, it was difficult to test the implicit assumption of static variable importance in expertise development. To investigate if this reliance may have imposed critical restrictions on the understanding of complex skill development, we adopted an alternative method, the online acquisition of telemetry data from a common daily activity for many: video gaming. Using measures of cognitive-motor, attentional, and perceptual processing extracted from game data from 3360 Real-Time Strategy players at 7 different levels of expertise, we identified 12 variables relevant to expertise. We show that the static variable importance assumption is false - the predictive importance of these variables shifted as the levels of expertise increased - and, at least in our dataset, that a contrastive approach would have been misleading. The finding that variable importance is not static across levels of expertise suggests that large, diverse datasets of sustained cognitive-motor performance are crucial for an understanding of expertise in real-world contexts. We also identify plausible cognitive markers of expertise.

Document type: 
Article

Improvement and Performance Evaluation for Multimedia Files Transmission in Vehicle-Based DTNs

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

In recent years, peer-to-peer (P2P) file sharing has been widely embraced and has become the largest source of Internet traffic. The development of the automobile industry has promoted a trend of deploying P2P networks over vehicle ad hoc networks (VANETs) for mobile content distribution. Due to the high mobility of nodes, their limited radio transmission range, and their sparse distribution, VANETs are partitioned and links are interrupted intermittently. Under these conditions, VANETs may become Vehicle-based Delay Tolerant Networks (VDTNs). Therefore, this work proposes an Optimal Fragmentation-based Multimedia Transmission scheme (OFMT) based on a P2P lookup protocol in VDTNs, which enables multimedia files to be sent to the receiver quickly and reliably in wireless mobile P2P networks over VDTNs. In addition, a method of calculating the most suitable fragment size is provided, which is tested and verified in simulation. We also show that OFMT can withstand a certain degree of DoS attack and that senders can freely join and leave the wireless mobile P2P network. Simulation results demonstrate that the proposed scheme can significantly improve the file delivery rate and shorten the file delivery delay compared with existing schemes.

Document type: 
Article

Barnacle: Detecting and Characterizing Tandem Duplications and Fusions in Transcriptome Assemblies

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Background

Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers.

Results

We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets.

Conclusions

Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.

Document type: 
Article

Analyzing The Impact Of Social Factors On Homelessness: A Fuzzy Cognitive Map Approach

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Background

The forces which affect homelessness are complex and often interactive in nature. Social forces such as addictions, family breakdown, and mental illness are compounded by structural forces such as lack of available low-cost housing, poor economic conditions, and insufficient mental health services. Together these factors impact levels of homelessness through their dynamic relations. Historic models, which are static in nature, have only been marginally successful in capturing these relationships.

Methods

Fuzzy Logic (FL) and fuzzy cognitive maps (FCMs) are particularly suited to the modeling of complex social problems, such as homelessness, due to their inherent ability to model intricate, interactive systems often described in vague conceptual terms and then organize them into a specific, concrete form (i.e., the FCM) which can be readily understood by social scientists and others. Using FL, we converted information taken from recently published, peer-reviewed articles for a select group of factors related to homelessness and then calculated the strength of influence (weights) for pairs of factors. We then used these weighted relationships in an FCM to test the effects of increasing or decreasing individual factors or groups of factors. Results of these trials were explainable according to current empirical knowledge related to homelessness.
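
The way weighted relationships drive an FCM can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's model: the three concepts, their weights, and the starting activations are hypothetical, and the update rule shown (each concept keeps its own value, adds the weighted influence of its sources, and is squashed by a sigmoid) is one common FCM variant among several.

```python
import math

# Hypothetical causal weights in [-1, 1]: (source concept, target concept) -> strength.
WEIGHTS = {
    ("addiction", "homelessness"): 0.8,
    ("education", "homelessness"): -0.6,
}


def sigmoid(x):
    """Squash an activation into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fcm_step(state, weights):
    """Advance every concept by one synchronous update."""
    new_state = {}
    for concept, value in state.items():
        influence = sum(
            w * state[src] for (src, dst), w in weights.items() if dst == concept
        )
        new_state[concept] = sigmoid(value + influence)
    return new_state


def run_fcm(state, weights, steps=20):
    """Iterate the map for a fixed number of steps and return the final state."""
    for _ in range(steps):
        state = fcm_step(state, weights)
    return state


# Scenario test: raise one factor and compare the effect on the target concept.
low_education = fcm_step(
    {"education": 0.2, "addiction": 0.5, "homelessness": 0.5}, WEIGHTS
)
high_education = fcm_step(
    {"education": 0.9, "addiction": 0.5, "homelessness": 0.5}, WEIGHTS
)
```

Because the hypothetical weight from education to homelessness is negative, the high-education scenario yields a lower homelessness activation after one step.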

Results

Prior graphic maps of homelessness have been of limited use due to the dynamic nature of the concepts related to homelessness. The FCM technique captures greater degrees of dynamism and complexity than static models, allowing relevant concepts to be manipulated and to interact. This, in turn, allows for a much more realistic picture of homelessness. Through network analysis of the FCM we determined that Education exerts the greatest force in the model and hence impacts the dynamism and complexity of a social problem such as homelessness.

Conclusions

The FCM built to model the complex social system of homelessness reasonably represented reality for the sample scenarios created. This confirmed that the model worked and that a search of peer reviewed, academic literature is a reasonable foundation upon which to build the model. Further, it was determined that the direction and strengths of relationships between concepts included in this map are a reasonable approximation of their action in reality. However, dynamic models are not without their limitations and must be acknowledged as inherently exploratory.

Document type: 
Article