Computing Science, School of

Receive updates for this collection

Creating Groups with Similar Expected Behavioural Response in Randomized Controlled Trials: A Fuzzy Cognitive Map Approach

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2014
Abstract: 

Background

Controlling bias is key to successful randomized controlled trials for behaviour change. Bias can be generated at multiple points during a study, for example, when participants are allocated to different groups. Several methods of allocations exist to randomly distribute participants over the groups such that their prognostic factors (e.g., socio-demographic variables) are similar, in an effort to keep participants’ outcomes comparable at baseline. Since it is challenging to create such groups when all prognostic factors are taken together, these factors are often balanced in isolation or only the ones deemed most relevant are balanced. However, the complex interactions among prognostic factors may lead to a poor estimate of behaviour, causing unbalanced groups at baseline, which may introduce accidental bias.

Methods

We present a novel computational approach for allocating participants to different groups. Our approach automatically uses participants’ experiences to model (the interactions among) their prognostic factors and infer how their behaviour is expected to change under a given intervention. Participants are then allocated based on their inferred behaviour rather than on selected prognostic factors.

Results

In order to assess the potential of our approach, we collected two datasets regarding the behaviour of participants (n = 430 and n = 187). The potential of the approach on larger sample sizes was examined using synthetic data. All three datasets highlighted that our approach could lead to groups with similar expected behavioural changes.

Conclusions

The computational approach proposed here can complement existing statistical approaches when behaviours involve numerous complex relationships, and quantitative data is not readily available to model these relationships. The software implementing our approach and commonly used alternatives is provided at no charge to assist practitioners in the design of their own studies and to compare participants' allocations.

Document type: 
Article
File(s): 

Whole Genome Sequencing of Turkish Genomes Reveals Functional Private Alleles and Impact of Genetic Interactions with Europe, Asia and Africa

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2014
Abstract: 

Background

Turkey is a crossroads of major population movements throughout history and has been a hotspot of cultural interactions. Several studies have investigated the complex population history of Turkey through a limited set of genetic markers. However, to date, there have been no studies to assess the genetic variation at the whole genome level using whole genome sequencing. Here, we present whole genome sequences of 16 Turkish individuals resequenced at high coverage (32 × -48×).

Results

We show that the genetic variation of the contemporary Turkish population clusters with South European populations, as expected, but also shows signatures of relatively recent contribution from ancestral East Asian populations. In addition, we document a significant enrichment of non-synonymous private alleles, consistent with recent observations in European populations. A number of variants associated with skin color and total cholesterol levels show frequency differentiation between the Turkish populations and European populations. Furthermore, we have analyzed the 17q21.31 inversion polymorphism region (MAPT locus) and found increased allele frequency of 31.25% for H1/H2 inversion polymorphism when compared to European populations that show about 25% of allele frequency.

Conclusion

This study provides the first map of common genetic variation from 16 western Asian individuals and thus helps fill an important geographical gap in analyzing natural human variation and human migration. Our data will help develop population-specific experimental designs for studies investigating disease associations and demographic history in Turkey.

Document type: 
Article
File(s): 

Approaches to Non-Intrusive Load Monitoring (NILM) in the Home

Author: 
Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2012-10-24
Abstract: 

When designing and implementing an intelligent energy conservation system for the home, it is essential to have insight into the activities and actions of the occupants. In particular, it is important to understand what appliances are being used and when. In the computational sustainability research community this is known as load disaggregation or Non-Intrusive Load Monitoring (NILM). NILM is a foundational algorithm that can disaggregate a home’s power usage into the individual appliances that are running, identify energy conservation opportunities. This depth report will focus on NILM algorithms, their use and evaluation. We will examine and evaluate the anatomy of NILM, looking at techniques using load monitoring, event detection, feature ex- traction, classification, and accuracy measurement. 

Document type: 
Technical Report
File(s): 

Changing Risk Behaviours and the HIV Epidemic: A Mathematical Analysis in the Context of Treatment as Prevention

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Background

Expanding access to highly active antiretroviral therapy (HAART) has become an important approach to HIV prevention in recent years. Previous studies suggest that concomitant changes in risk behaviours may either help or hinder programs that use a Treatment as Prevention strategy.

Analysis

We consider HIV-related risk behaviour as a social contagion in a deterministic compartmental model, which treats risk behaviour and HIV infection as linked processes, where acquiring risk behaviour is a prerequisite for contracting HIV. The equilibrium behaviour of the model is analysed to determine epidemic outcomes under conditions of expanding HAART coverage along with risk behaviours that change with HAART coverage. We determined the potential impact of changes in risk behaviour on the outcomes of Treatment as Prevention strategies. Model results show that HIV incidence and prevalence decline only above threshold levels of HAART coverage, which depends strongly on risk behaviour parameter values. Expanding HAART coverage with simultaneous reduction in risk behaviour act synergistically to accelerate the drop in HIV incidence and prevalence. Above the thresholds, additional HAART coverage is always sufficient to reverse the impact of HAART optimism on incidence and prevalence. Applying the model to an HIV epidemic in Vancouver, Canada, showed no evidence of HAART optimism in that setting.

Conclusions

Our results suggest that Treatment as Prevention has significant potential for controlling the HIV epidemic once HAART coverage reaches a threshold. Furthermore, expanding HAART coverage combined with interventions targeting risk behaviours amplify the preventive impact, potentially driving the HIV epidemic to elimination.

Document type: 
Article
File(s): 

Video Game Telemetry as a Critical Tool in the Study of Complex Skill Learning

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Cognitive science has long shown interest in expertise, in part because prediction and control of expert development would have immense practical value. Most studies in this area investigate expertise by comparing experts with novices. The reliance on contrastive samples in studies of human expertise only yields deep insight into development where differences are important throughout skill acquisition. This reliance may be pernicious where the predictive importance of variables is not constant across levels of expertise. Before the development of sophisticated machine learning tools for data mining larger samples, and indeed, before such samples were available, it was difficult to test the implicit assumption of static variable importance in expertise development. To investigate if this reliance may have imposed critical restrictions on the understanding of complex skill development, we adopted an alternative method, the online acquisition of telemetry data from a common daily activity for many: video gaming. Using measures of cognitive-motor, attentional, and perceptual processing extracted from game data from 3360 Real-Time Strategy players at 7 different levels of expertise, we identified 12 variables relevant to expertise. We show that the static variable importance assumption is false - the predictive importance of these variables shifted as the levels of expertise increased - and, at least in our dataset, that a contrastive approach would have been misleading. The finding that variable importance is not static across levels of expertise suggests that large, diverse datasets of sustained cognitive-motor performance are crucial for an understanding of expertise in real-world contexts. We also identify plausible cognitive markers of expertise.

Document type: 
Article
File(s): 

Improvement and Performance Evaluation for Multimedia Files Transmission in Vehicle-Based DTNs

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

In recent years, P2P file sharing has been widely embraced and becomes the largest application of the Internet traffic. And the development of automobile industry has promoted a trend of deploying Peer-to-Peer (P2P) networks over vehicle ad hoc networks (VANETs) for mobile content distribution. Due to the high mobility of nodes, nodes’ limited radio transmission range and sparse distribution, VANETs are divided and links are interrupted intermittently. At this moment, VANETs may become Vehicle-based Delay Tolerant Network (VDTNs). Therefore, this work proposes an Optimal Fragmentation-based Multimedia Transmission scheme (OFMT) based on P2P lookup protocol in VDTNs, which can enable multimedia files to be sent to the receiver fast and reliably in wireless mobile P2P networks over VDTNs. In addition, a method of calculating the most suitable size of the fragment is provided, which is tested and verified in the simulation. And we also show that OFMT can defend a certain degree of DoS attack and senders can freely join and leave the wireless mobile P2P network. Simulation results demonstrate that the proposed scheme can significantly improve the performance of the file delivery rate and shorten the file delivery delay compared with the existing schemes.

Document type: 
Article
File(s): 

Barnacle: Detecting and Characterizing Tandem Duplications and Fusions in Transcriptome Assemblies

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Background

Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers.

Results

We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets.

Conclusions

Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.

Document type: 
Article
File(s): 

Analyzing The Impact Of Social Factors On Homelessness: A Fuzzy Cognitive Map Approach

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

Background

The forces which affect homelessness are complex and often interactive in nature. Social forces such as addictions, family breakdown, and mental illness are compounded by structural forces such as lack of available low-cost housing, poor economic conditions, and insufficient mental health services. Together these factors impact levels of homelessness through their dynamic relations. Historic models, which are static in nature, have only been marginally successful in capturing these relationships.

Methods

Fuzzy Logic (FL) and fuzzy cognitive maps (FCMs) are particularly suited to the modeling of complex social problems, such as homelessness, due to their inherent ability to model intricate, interactive systems often described in vague conceptual terms and then organize them into a specific, concrete form (i.e., the FCM) which can be readily understood by social scientists and others. Using FL we converted information, taken from recently published, peer reviewed articles, for a select group of factors related to homelessness and then calculated the strength of influence (weights) for pairs of factors. We then used these weighted relationships in a FCM to test the effects of increasing or decreasing individual or groups of factors. Results of these trials were explainable according to current empirical knowledge related to homelessness.

Results

Prior graphic maps of homelessness have been of limited use due to the dynamic nature of the concepts related to homelessness. The FCM technique captures greater degrees of dynamism and complexity than static models, allowing relevant concepts to be manipulated and interacted. This, in turn, allows for a much more realistic picture of homelessness. Through network analysis of the FCM we determined that Education exerts the greatest force in the model and hence impacts the dynamism and complexity of a social problem such as homelessness.

Conclusions

The FCM built to model the complex social system of homelessness reasonably represented reality for the sample scenarios created. This confirmed that the model worked and that a search of peer reviewed, academic literature is a reasonable foundation upon which to build the model. Further, it was determined that the direction and strengths of relationships between concepts included in this map are a reasonable approximation of their action in reality. However, dynamic models are not without their limitations and must be acknowledged as inherently exploratory.

Document type: 
Article
File(s): 

deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Gene fusions created by somatic genomic rearrangements are known to play an important role in the onset and development of some cancers, such as lymphomas and sarcomas. RNA-Seq (whole transcriptome shotgun sequencing) is proving to be a useful tool for the discovery of novel gene fusions in cancer transcriptomes. However, algorithmic methods for the discovery of gene fusions using RNA-Seq data remain underdeveloped. We have developed deFuse, a novel computational method for fusion discovery in tumor RNA-Seq data. Unlike existing methods that use only unique best-hit alignments and consider only fusion boundaries at the ends of known exons, deFuse considers all alignments and all possible locations for fusion boundaries. As a result, deFuse is able to identify fusion sequences with demonstrably better sensitivity than previous approaches. To increase the specificity of our approach, we curated a list of 60 true positive and 61 true negative fusion sequences (as confirmed by RT-PCR), and have trained an adaboost classifier on 11 novel features of the sequence data. The resulting classifier has an estimated value of 0.91 for the area under the ROC curve. We have used deFuse to discover gene fusions in 40 ovarian tumor samples, one ovarian cancer cell line, and three sarcoma samples. We report herein the first gene fusions discovered in ovarian cancer. We conclude that gene fusions are not infrequent events in ovarian cancer and that these events have the potential to substantially alter the expression patterns of the genes involved; gene fusions should therefore be considered in efforts to comprehensively characterize the mutational profiles of ovarian cancer transcriptomes.

Document type: 
Article
File(s): 

Module Discovery by Exhaustive Search for Densely Connected, Co-Expressed Regions in Biomolecular Interaction Networks

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

Background

Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established, when analyzing these two data sources jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented.

Methodology/Principal Findings

We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automatized filtering procedure reduces our output which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search, by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples.

Conclusion/Significance

We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets.

Document type: 
Article
File(s):