Invited Speakers

Nikhil R. Pal

Electronics and Communications Science Unit (ECSU)
Indian Statistical Institute
Editor-in-chief, IEEE Trans. Fuzzy Systems

TOPIC On Finding Short Structural Building Blocks for Constructing 3D Structure of Proteins
ABSTRACT In this talk, first, we shall present a simple yet effective method called Structural Mountain Clustering Method (SMCM) for finding a small library of short structural motifs that can be used to construct 3D structures of unknown proteins. To reduce the computational overhead associated with this method, an incremental version of the SMCM will then be proposed. We shall demonstrate using two databases that the proposed SMCM and its incremental variant are equally effective and are better than a competitive algorithm (which inspired our investigation) in terms of local and global reconstruction error. Then we shall adapt the Self-organizing Map (SOM) neural network so that it can also find building blocks or short structural motifs. Since SOM has a density matching property, the structural motifs associated with SOM are expected to be very effective for the present problem. We call this algorithm Structural Self-organizing Map (SSOM). This SSOM leads to a few other variants of the algorithm. The effectiveness of the SOM based algorithms will also be demonstrated. The talk will be concluded with a discussion on the issues that are left to be investigated in this context.

Chun-Nan Hsu

Research Fellow
Institute of Information Science
Academia Sinica, Taiwan

TOPIC Integrating High Dimensional Bi-directional Parsing Models for Gene Mention Tagging
ABSTRACT Motivation: Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this paper, we describe in detail our gene mention tagger, which participated in the BioCreative 2 challenge, and analyze what contributes to its good performance. Our tagger is based on the conditional random fields (CRF) model, the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it achieved the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly by applying open source packages, making it easy to duplicate our results. Results: We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward parsing, and integrated the two models based on the output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of an additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models will produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting for our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but consistent advantage over forward parsing models. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on the output scores. Experimental results show that this integrated model can achieve an even higher F-score solely based on the training corpus for gene mention tagging.
Availability: Data sets, programs and an on-line service of our gene mention tagger can be accessed at

Yu-Ju Chen

Associate Research Fellow
Institute of Chemistry,
Genomics Research Center,
Academia Sinica, Taiwan

TOPIC New Strategies Towards Rapid and Comprehensive Proteomic Signatures for Biomedical Applications
ABSTRACT In the post-genome era, the field of proteomics promises to identify altered abundances or structures of tissue or fluid markers associated with human diseases. The discovery and utility of biomarkers may provide earlier diagnosis and improved therapeutic intervention. Here, I will present our newly developed proteomic strategies for comprehensive and quantitative profiling of the phosphoproteome and the membrane proteome. Towards multiplexed, comprehensive and robust quantitation of the membrane proteome, we developed a strategy combining gel-assisted digestion, iTRAQ labeling, and LC-MS/MS. Quantitation of four independently purified membrane fractions from HeLa cells gave high accuracy (< 8% error) and precision (< 12% RSD). Most remarkably, topological analysis revealed that the biggest improvement was achieved in the detection of transmembrane peptides from integral membrane proteins with up to 19 transmembrane helices. To the best of our knowledge, this level of coverage exceeds that previously achieved using MS and provides superior quantitation accuracy compared with other methods. We applied this approach to the first proteomic delineation of phenotypic expression in a mouse model of autosomal-dominant polycystic kidney disease (ADPKD). The result demonstrates how comparative membrane proteomics can provide insight into the molecular mechanisms underlying ADPKD and the identification of potential drug targets. Abnormal protein phosphorylation has been reported to be crucial in cancer metastasis. The second part of the presentation will focus on a label-free strategy for large-scale quantification of the phosphoproteome. The performance of the new approach will be demonstrated on a lung cancer metastasis model. By quantitative analysis of lung adenocarcinoma cell lines with varying degrees of invasiveness, a total of 1231 phosphoproteins were identified; a significant number of proteins (838) were found to be differentially phosphorylated in metastatic lung cancer cells. Some of these constitutively phosphorylated and over-activated protein kinases are known to be involved in lung adenocarcinoma, including those in the ERK/MAPK signaling pathway and in cell migration.

Hsuan-Cheng Huang

Associate Professor
Institute of Biomedical Informatics
Center for Systems and Synthetic Biology
National Yang Ming University, Taiwan

TOPIC MicroRNA Regulation in Protein Interaction Network
ABSTRACT Protein-protein interactions are critical to most biological processes. High-throughput experiments on protein-protein interactions now allow us to build an interaction network that gives more insight into these processes. MicroRNAs regulate protein-encoding genes at the post-transcriptional level. However, the relationship between the protein-protein interaction network and microRNA regulation is still not clear. We have performed topological analysis to elucidate the global correlation between microRNA regulation and the protein-protein interaction network in humans. The analysis showed that target genes of individual microRNAs tend to be hubs and bottlenecks in the network. While proteins directly regulated by a microRNA might not form a network module themselves, the microRNA target genes and their interacting neighbors jointly showed significantly higher density and modularity. Our findings shed light on how microRNAs may regulate the protein interaction network.
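
The topological quantities mentioned above (node degree as a simple hub measure, and the density of a group of genes together with their interacting neighbors) can be illustrated with a minimal sketch. The toy network, the gene names and all function names below are hypothetical and are not taken from the study; the edge list is assumed to be undirected, with no duplicate edges or self-loops.

```python
def degree(edges):
    """Count interaction partners per node (a simple hub measure)."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


def with_neighbors(edges, seeds):
    """Return the seed nodes together with their direct interaction partners."""
    seeds = set(seeds)
    group = set(seeds)
    for a, b in edges:
        if a in seeds:
            group.add(b)
        if b in seeds:
            group.add(a)
    return group


def induced_density(edges, nodes):
    """Density of the subnetwork induced by `nodes`: 2E / (N(N - 1))."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = sum(1 for a, b in edges if a in nodes and b in nodes)
    return 2.0 * inside / (n * (n - 1))


# Hypothetical toy network; "A" and "E" play the role of microRNA targets.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
         ("D", "E"), ("E", "F"), ("F", "G"), ("G", "H")]
targets = {"A", "E"}

hub_degree = degree(edges)["A"]
density_targets = induced_density(edges, targets)
density_joint = induced_density(edges, with_neighbors(edges, targets))
```

In this toy case the two targets do not interact directly, so their own density is zero, while the targets and their neighbors together form a denser subnetwork than the network as a whole.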

Hsueh-Fen Juan

Associate Professor
Department of Life Science, Institute of Molecular & Cellular Biology
Institute of Biomedical Electronics & Bioinformatics
Center for Systems Biology and Bioinformatics
National Taiwan University, Taiwan

TOPIC Anti-tumor Activity of Reishi Polysaccharides: from Gene Expression to Network
ABSTRACT Ganoderma lucidum (Reishi) has been widely used as a herbal medicine for promoting health and longevity in China and other Asian countries. Polysaccharide extracts from Reishi have been reported to exhibit immuno-modulating and anti-tumor activities. In previous studies, F3, the active component of the polysaccharide extract, was found to activate various cytokines such as IL-1, IL-6, IL-12, and TNF-α. This gave rise to our investigation of how F3 stimulates anti-tumor effects in human leukemia THP-1 cells. Here, we integrated time-course DNA microarray analysis, quantitative PCR assays, and bioinformatics methods to study the F3-induced effects in THP-1 cells. Pathways significantly disturbed by F3 were identified by statistical analysis of the microarray data. Apoptosis induction through the DR3 and DR4/5 death receptors was found to be one of the most significant pathways and to play a key role in THP-1 cells after F3 treatment. Based on time-course gene expression measurements of the identified pathway, we reconstructed a plausible regulatory network of the involved genes using a reverse-engineering computational approach. Our results showed that F3 may induce death receptor ligands to initiate signaling via receptor oligomerization, recruitment of specialized adaptor proteins and activation of caspase cascades.

Yen-Jen Oyang

Department of Computer Science and Information Engineering
Institute of Biomedical Electronics and Bioinformatics
Director of Center for Systems Biology and Bioinformatics
National Taiwan University, Taiwan

TOPIC Alternative Machine Learning Algorithms for Bioinformatics Applications
ABSTRACT In this presentation, we will describe the basic concepts of alternative machine learning algorithms that are based on kernel functions. The alternative algorithms addressed include (1) the support vector machine, (2) the regularization network, and (3) the kernel density estimation based algorithm. We will then discuss the main characteristics of these alternative algorithms and their effects when applied to bioinformatics problems. The characteristics and effects addressed include execution time, sensitivity to parameter settings, sensitivity to feature selection, and accuracy. Our discussion will show that no single algorithm is superior to rival algorithms in all aspects.
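
As an illustration of the third family listed above, a kernel density estimation based classifier can be sketched in a few lines. This is a generic one-dimensional textbook version with a Gaussian kernel and a fixed bandwidth, not the speaker's algorithm; the function names, the bandwidth value and the toy data are all assumptions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of 1-D `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kde_classifier(train, bandwidth=0.5):
    """Build a classifier from {label: [values]} using prior * class density."""
    total = sum(len(values) for values in train.values())
    models = {
        label: (len(values) / total, gaussian_kde(values, bandwidth))
        for label, values in train.items()
    }

    def predict(x):
        # Pick the label with the largest prior-weighted density at x.
        return max(models, key=lambda lab: models[lab][0] * models[lab][1](x))

    return predict


# Hypothetical toy training data with two well-separated classes.
train = {"low": [0.9, 1.1, 1.3, 1.0], "high": [2.9, 3.2, 3.0, 3.4]}
predict = kde_classifier(train)
label_a = predict(1.2)
label_b = predict(3.1)
```

The bandwidth is the main parameter of such a classifier, which is one reason sensitivity to parameter settings appears among the characteristics compared in the talk.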

Zemin Yao

Director, Molecular and Cellular Biology Lab,
University of Ottawa Heart Institute
Professor and Chair,
Department of Biochemistry, Microbiology and Immunology,
University of Ottawa

TOPIC Unravelling Monogenic Dispositions in Complex Metabolic Disorders Associated with Familial Combined Hypertriglyceridemia
ABSTRACT Genome-wide scans in conjunction with genotype imputation and meta-analysis have been increasingly used to identify genetic variants influencing plasma lipid concentrations that are associated with hypercholesterolemia, hypertriglyceridemia, type 2 diabetes, and premature coronary heart disease. Loci showing strong association with altered lipid metabolism have thus been confirmed, such as the APOA5-APOA4-APOC3-APOA1 cluster, APOB, CETP, LDLR, LPL, LIPC, and PCSK9. Low-frequency genetic variations associated with rare lipid/lipoprotein disorders have also been identified, such as those in MTTP, ABCA1, AGPTA, and LPIN1, which are linked to abetalipoproteinemia, hypoalphalipoproteinemia, and lipodystrophy. Advancing from this rich genetic information to the development of therapeutic strategies for various lipid/lipoprotein metabolic disorders requires a deep understanding of pathophysiological mechanisms at the cellular, molecular, and atomic levels, which remains a formidable challenge to life and clinical scientists in the post-genome era. The examples to be presented illustrate the power of multidisciplinary approaches in unraveling monogenic dispositions associated with familial combined hypertriglyceridemia, a metabolic lipid/lipoprotein disorder commonly viewed as a multifactorial disease.

Arthur Chun-Chieh Shih

Associate Research Fellow,
Institute of Information Science
Academia Sinica, Taiwan

TOPIC Computational Analysis of Human Influenza A Virus Evolution
ABSTRACT In circulating influenza viruses, antigenic drift is a major process in which mutations accumulate at the antibody binding sites of hemagglutinin (HA), allowing the virus to evade recognition by the host’s antibodies. Because such mutations in H3 HA occur often and new variants tend to replace older ones quickly, the evolution of the HA gene of H3 is much faster than that of other subtypes. Thus, it is important to know what kind of selection pressure operates on HA1, because this can enhance our understanding of influenza virus evolution as well as vaccine strain prediction. However, owing to methodological difficulties, the inference of positively selected amino acid residues in the HA1 domain of HA varies from study to study. To resolve controversies on whether only a few or many residue sites of HA1 have undergone positive selection, whether positive selection at HA1 is continual or punctuated, and whether antigenic change is punctuated, in this talk we introduce two different approaches to analyzing H3 HA1 sequences. As a result, we have identified many effective substitutions, each of which occurred very rapidly. The fact that most of the substitutions occurred at antigenic sites indicates that hitchhiking plays a minor role and that most of these sites, many more than previously found, have undergone positive selection. Our results suggest that positive selection has been ongoing most of the time, not sporadic, and that multiple mutations at antigenic sites cumulatively enhance antigenic drift, indicating that antigenic change is less punctuated than recently proposed.

Ming-Jing Hwang

Research Fellow and Deputy Director,
Institute of Biomedical Sciences
Academia Sinica, Taiwan

TOPIC A Network Approach to the Protein Docking Problem
ABSTRACT We have applied network analysis to the protein docking problem with a novel scoring function derived from network motifs of interactions between proteins and bound ligands. The resulting scoring function is entirely non-energy-based; docking is instead scored by protein-ligand interaction motifs. The scoring function has been tested on 100 protein-ligand complex structures to assess its ability to identify near-native conformations from a set of decoys. In these complex structures, 84% of the highest-scored docking conformations have root-mean-square deviations (RMSDs) below 2.0 Å, which is comparable with the best conventional energy-based docking scoring functions. Significantly, these interaction network motifs appear to be able to capture protein-ligand interactions beyond the pairwise, two-body interactions that are commonly modeled in conventional molecular mechanics force fields.
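
The evaluation criterion quoted above (whether the highest-scored conformation lies within 2.0 Å RMSD of the native structure) can be sketched as follows. This is only an illustration of the RMSD criterion, not of the motif-based scoring function itself. It assumes that atoms are matched by index, that no superposition is applied, and that a higher score means a better pose; the function names, coordinates and scores are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def top_pose_is_near_native(poses, native, cutoff=2.0):
    """Check whether the best-scored pose is within `cutoff` of the native pose.

    `poses` is a list of (score, coords) pairs; higher scores are assumed better.
    """
    best_score, best_coords = max(poses, key=lambda pose: pose[0])
    return rmsd(best_coords, native) < cutoff


# Hypothetical two-atom ligand and two decoy poses (coordinates in Å).
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
near = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]   # shifted by 1 Å along z
far = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]    # shifted by 3 Å along z
poses = [(0.9, near), (0.4, far)]

near_rmsd = rmsd(near, native)
success = top_pose_is_near_native(poses, native)
```

Repeating this check over a set of complexes and taking the fraction of successes gives a success rate of the kind reported in the abstract.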

Hsien-Da Huang

Associate Professor,
Department of Biological Science & Technology
Institute of Bioinformatics and Systems Biology
National Chiao Tung University, Taiwan

TOPIC Bioinformatics Research in MicroRNA Regulation: Databases and Tools
ABSTRACT Recent work has demonstrated that microRNAs (miRNAs) are involved in critical biological processes by suppressing the translation of coding genes. To facilitate the investigation of microRNA regulation, we previously developed several biological databases and computational tools. In this talk, I will briefly introduce the following databases and tools for bioinformatics research in miRNA regulation: (1) miRNAMap (Nucl Acids Res, 2006; Nucl Acids Res, 2008): an integrated resource that collects experimentally verified microRNAs and both known and putative miRNA target genes in human, mouse, rat and other metazoan genomes; (2) ViTa (Nucl Acids Res, 2007): a database of host microRNA targets on viruses; (3) miRStart: a resource that collects transcriptional start sites (promoters) of miRNAs; (4) miRNA Target Prediction: three computational tools, miRanda, RNAhybrid and TargetScan; (5) miRExpress: an effective tool to generate miRNA expression profiles from second-generation sequencing data; (6) RNALogo (Nucl Acids Res, 2008): a new display of structural RNA families.

Yu-Chuan Li

Professor and Chair,
Institute of Biomedical Informatics
National Yang Ming University, Taiwan
President of Asia Pacific Association for Medical Informatics (APAMI)

TOPIC Machine-learning Mechanisms in Prediction of Anti-psychotic Drug Response with Combined Pharmacogenetic and Clinical Data
ABSTRACT Innovative use of a set of computer algorithms called “machine learning” is promising in predicting individual drug response for patients taking anti-psychotic drugs. The new era of “Clinical Bioinformatics” has been promising people a future of “Personalized Medicine” or “Individual-Based Medicine (IBM)” for quite a few years. However, there are still relatively few examples of personalized drugs currently used by clinicians. Although the clinical response to a new generation of targeted therapy drugs for cancer patients does depend partially on several SNPs (Single Nucleotide Polymorphisms), the lack of a mathematically clear relationship between the clinical response and the SNPs still does not help much in the selection of patients. We will present a study focusing on the prediction of clinical response to anti-psychotic drugs. Using a popular machine-learning mechanism called “Artificial Neural Networks”, we were able to predict an individual patient’s clinical response based on several clinical observations and pharmacogenetic markers with an accuracy of 83.3%. This kind of study demonstrates new possibilities for the future of personalized medicine. It will not only reduce the cost of trial-and-error medications, but also reduce the unnecessary side-effects and complications caused by ineffective therapy.