Computing Science, School of

Linearization of Ancestral Multichromosomal Genomes

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2012
Abstract: 

BACKGROUND: Recovering the structure of ancestral genomes can be formalized in terms of properties of binary matrices such as the Consecutive-Ones Property (C1P). The Linearization Problem asks to extract, from a given binary matrix, a maximum weight subset of rows that satisfies such a property. This problem is intractable in general, and in particular when the ancestral genome is expected to contain only linear chromosomes or a unique circular chromosome. In the present work, we consider a relaxation of this problem, which allows ancestral genomes that can contain several chromosomes, each either linear or circular.
RESULT: We show that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem. This result holds in the more general case where columns have bounded multiplicity, which models possibly duplicated ancestral genes. We also prove that for matrices with rows of degrees 2 and 3, without multiplicity and without weights on the rows, the problem is NP-complete, thus tracing sharp tractability boundaries.
CONCLUSION: As happened for the breakpoint median problem, also used in ancestral genome reconstruction, relaxing the definition of a genome turns an intractable problem into a tractable one. The relaxation is adapted to some biological contexts, such as bacterial genomes with several replicons, possibly partially assembled. The algorithms can also be used as heuristics for hard variants. More generally, this work opens a way to better understand linearization results for ancestral genome structure inference.
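The relaxed property for degree-two rows can be illustrated with a small sketch. It assumes markers without multiplicity, so that a set of adjacencies fits a genome of linear and circular chromosomes exactly when no marker appears in more than two distinct adjacencies. The function names are hypothetical, and this is only a feasibility check, not the matching-based algorithm of the article.

```python
from collections import defaultdict


def normalise(adjacencies):
    """Treat adjacencies as unordered pairs and drop duplicates."""
    return {tuple(sorted(pair)) for pair in adjacencies}


def is_relaxed_linearizable(adjacencies):
    """True if every marker takes part in at most two distinct adjacencies.

    Assumption: no column multiplicity, so each marker occurs once in the genome.
    """
    degree = defaultdict(int)
    for a, b in normalise(adjacencies):
        if a == b:  # a marker cannot be adjacent to itself
            return False
        degree[a] += 1
        degree[b] += 1
    return all(d <= 2 for d in degree.values())


def classify_chromosomes(adjacencies):
    """Group markers into connected components and label each one.

    A component with as many adjacencies as markers is a cycle (circular);
    otherwise it is a path (linear).
    """
    edges = normalise(adjacencies)
    if not is_relaxed_linearizable(edges):
        raise ValueError("some marker has more than two neighbours")
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(neighbours[v] - component)
        seen |= component
        n_edges = sum(len(neighbours[v]) for v in component) // 2
        kind = "circular" if n_edges == len(component) else "linear"
        result.append((kind, sorted(component)))
    return result
```

For example, `classify_chromosomes([(1, 2), (2, 3), (3, 1), (4, 5)])` reports one circular chromosome on markers 1 to 3 and one linear chromosome on markers 4 and 5, whereas a marker with three neighbours makes the set infeasible.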

Document type: 
Article

smyRNA: A Novel Ab Initio ncRNA Gene Finder

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2009-05-05
Abstract: 

Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by establishing stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Although ncRNAs are abundant, discovering novel ncRNAs in genome sequences has proven to be a hard task; in particular, past attempts at ab initio ncRNA search mostly failed, with the exception of tools that can identify microRNAs.
Methodology/Principal Findings: We present a very general ab initio ncRNA gene finder that exploits differential distributions of sequence motifs between ncRNAs and background genome sequences.
Conclusions/Significance: Our method, once trained on a set of ncRNAs from a given species, can be applied to genome sequences of other organisms to find not only ncRNAs homologous to those in the training set but also others that potentially belong to novel (and perhaps unknown) ncRNA families. Availability: http://compbio.cs.sfu.ca/taverna/smyrna

Document type: 
Article

The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2008
Abstract: 

The Generation Challenge Programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public, platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.

Document type: 
Article

Conditional Random Fields and Supervised Learning in Automated Skin Lesion Diagnosis

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2011
Abstract: 

Many subproblems in automated skin lesion diagnosis (ASLD) can be unified under a single generalization of assigning a label, from a predefined set, to each pixel in an image. We first formalize this generalization and then present two probabilistic models capable of solving it. The first model is based on independent pixel labeling using maximum a-posteriori (MAP) estimation. The second model is based on conditional random fields (CRFs), where dependencies between pixels are defined using a graph structure. Furthermore, we demonstrate how supervised learning and an appropriate training set can be used to automatically determine all model parameters. We evaluate both models' ability to segment a challenging dataset consisting of 116 images and compare our results to 5 previously published methods.
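The first model in the abstract above, independent pixel labeling by MAP estimation, can be sketched as follows. Everything specific here is an assumption made for illustration: each pixel is reduced to one scalar intensity, each label has a Gaussian likelihood, and the priors, means, and standard deviations are invented values rather than parameters learned from the article's dataset.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used as a class-conditional likelihood."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def map_label(intensity, classes):
    """Return the label maximising prior * likelihood for a single pixel.

    `classes` maps label -> (prior, mean, std); all three are hypothetical here.
    """
    def posterior_score(label):
        prior, mean, std = classes[label]
        return prior * gaussian_pdf(intensity, mean, std)

    return max(classes, key=posterior_score)


def label_image(image, classes):
    """Label every pixel independently; neighbouring pixels are ignored."""
    return [[map_label(pixel, classes) for pixel in row] for row in image]


# Hypothetical two-label setup: bright pixels are skin, dark pixels are lesion.
CLASSES = {"skin": (0.7, 0.8, 0.1), "lesion": (0.3, 0.3, 0.1)}
IMAGE = [[0.85, 0.25], [0.78, 0.35]]
LABELS = label_image(IMAGE, CLASSES)
```

Because each pixel is decided on its own, isolated mislabeled pixels are possible; the CRF model in the abstract addresses this by adding dependencies between neighbouring pixels.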

Document type: 
Article

Improvement and Performance Evaluation for Multimedia Files Transmission in Vehicle-Based DTNs

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2013
Abstract: 

In recent years, P2P file sharing has been widely embraced and has become the largest application in terms of Internet traffic. The development of the automobile industry has also promoted a trend of deploying Peer-to-Peer (P2P) networks over vehicle ad hoc networks (VANETs) for mobile content distribution. Due to the high mobility of nodes, their limited radio transmission range, and their sparse distribution, VANETs are partitioned and links are interrupted intermittently. In this situation, VANETs may become Vehicle-based Delay Tolerant Networks (VDTNs). Therefore, this work proposes an Optimal Fragmentation-based Multimedia Transmission scheme (OFMT) based on a P2P lookup protocol in VDTNs, which enables multimedia files to be sent to the receiver quickly and reliably in wireless mobile P2P networks over VDTNs. In addition, a method of calculating the most suitable fragment size is provided, which is tested and verified in simulation. We also show that OFMT can withstand DoS attacks to a certain degree and that senders can freely join and leave the wireless mobile P2P network. Simulation results demonstrate that the proposed scheme can significantly improve the file delivery rate and shorten the file delivery delay compared with existing schemes.

Document type: 
Article

Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2007-09
Abstract: 

The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%–5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages.

Document type: 
Article

Not All Scale-Free Networks Are Born Equal: The Role of the Seed Graph in PPI Network Evolution

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2007-07
Abstract: 

The (asymptotic) degree distributions of the best-known “scale-free” network models are all similar and are independent of the seed graph used; hence, it has been tempting to assume that networks generated by these models are generally similar. In this paper, we observe that several key topological features of such networks depend heavily on the specific model and the seed graph used. Furthermore, we show that starting with the “right” seed graph (typically a dense subgraph of the protein–protein interaction network analyzed), the duplication model captures many topological features of publicly available protein–protein interaction networks very well.
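To make the role of the seed graph concrete, here is a minimal sketch of one common variant of a duplication model. It is not necessarily the exact formulation used in the article: the parameter names `p_keep` and `p_link`, their default values, and the choice to keep isolated nodes are all assumptions made for illustration.

```python
import random


def duplication_model(seed_edges, n_nodes, p_keep=0.4, p_link=0.1, rng=None):
    """Grow an undirected graph from a seed graph by repeated node duplication.

    Each step copies a random existing node: the copy keeps each of the
    parent's edges with probability p_keep and is linked to the parent
    itself with probability p_link. Node ids are assumed to be integers.
    """
    rng = rng or random.Random(0)
    graph = {}
    for a, b in seed_edges:  # build the seed graph as adjacency sets
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    next_id = max(graph) + 1
    while len(graph) < n_nodes:
        parent = rng.choice(sorted(graph))
        # Sorting keeps the sequence of random draws reproducible.
        kept = {v for v in sorted(graph[parent]) if rng.random() < p_keep}
        if rng.random() < p_link:
            kept.add(parent)
        graph[next_id] = kept  # may be empty: isolated nodes are kept here
        for v in kept:
            graph[v].add(next_id)
        next_id += 1
    return graph


# Two hypothetical seeds of the same size: a triangle and a three-node path.
triangle = duplication_model([(0, 1), (1, 2), (2, 0)], 200, rng=random.Random(1))
path = duplication_model([(0, 1), (1, 2)], 200, rng=random.Random(1))
```

Comparing statistics such as clustering or the size of the largest clique across graphs grown from different seeds is one way to observe the seed dependence that the abstract describes.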

Document type: 
Article

Ambient Data Collection with Wireless Sensor Networks

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

One of the most important applications for wireless sensor networks (WSNs) is data collection, where sensing data are collected at sensor nodes and forwarded to a central base station for further processing. Because they use battery power and wireless communication, sensor nodes can be very small and easily attached at specified locations without disturbing the surrounding environment. This makes WSNs a competitive approach for data collection compared with their wired counterparts. In this paper, we review recent advances in this research area. We first highlight the special features of data collection WSNs by comparing them with wired data collection networks and other WSN applications. With these features in mind, we then discuss issues and prior solutions in data gathering protocol design. Our discussion also covers different approaches for message dissemination, which is a critical component for network control and management and greatly affects the overall performance of a data collection WSN system.

Document type: 
Article

Cooperative Coding and Caching for Streaming Data in Multihop Wireless Networks

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2010
Abstract: 

This paper studies distributed cache management for the streaming applications now flourishing in multihop wireless networks. Many cache management schemes to date use a randomized network coding approach, which provides an elegant solution for ubiquitous data access in such systems. However, the encoding, essentially a combination operation, makes the coded data difficult to change. In particular, to accommodate new data, the system may have to first decode all the combined data segments, remove some unimportant ones, and then re-encode the data segments. This procedure is clearly expensive for continuously evolving data storage. As such, we introduce a novel Cooperative Coding and Caching (C3) scheme, which allows decoding-free data removal through a triangle-like codeword organization. Its decoding performance is very close to that of conventional network coding, with only a sublinear overhead. Our scheme offers a promising solution to cache management for streaming data.

Document type: 
Article