Quantitative Biology: Model and Inference
(Workshop on Quantitative Biology: Model and Inference)
I. Schedule
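May 12 (Saturday), morning, 8:45-11:30
Time | Event | Chair | Venue |
8:45-9:00 | Opening ceremony and group photo | | Room 105, Haiyun Campus Laboratory Building |
9:00-9:30 | 邓明华 (Peking University): Network inference for compositional data and its application to metagenomic data | 胡杰 | |
9:30-10:00 | 樊晓丹 (The Chinese University of Hong Kong): Several Computational Problems in Comparative Methylation Analyses | | |
10:00-10:30 | Tea break | | |
10:30-11:00 | 翟巍巍 (Genome Institute of Singapore): Mutational mapping: a versatile approach exploring adaptive evolution over large phylogenies | 蒋达权 | |
11:00-11:30 | 雷锦誌 (Tsinghua University): Mathematical modeling of vascular endothelial cell migration | | |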
May 12 (Saturday), afternoon, 14:30-17:00
Time | Event | Chair | Venue |
14:30-15:00 | 张兴安 (Central China Normal University): Mutation mechanism of human breast cancer | 周达 | Room 105, Haiyun Campus Laboratory Building |
15:00-15:30 | 张家军 (Sun Yat-sen University): Inferring gene transcription dynamics from single-cell transcriptome data | | |
15:30-16:00 | Tea break | | |
16:00-16:30 | 尹建鑫 (Renmin University of China): A fusion penalized logistic threshold regression model with application to diabetes prediction | 程璟 | |
16:30-17:00 | 王颖: Comparing and Analyzing Microbial Communities on Metagenomic Data with Alignment-free Model | | |
May 13 (Sunday), morning, 8:30-11:30
Time | Event | Chair | Venue |
8:30-9:00 | 刘跃武 (Hunan Agricultural University): Mathematical modeling of HIV-1 Gag trafficking, polymerization and assembly | 陈坤 | Room 105, Haiyun Campus Laboratory Building |
9:00-9:30 | 蔡敬衡 (Sun Yat-sen University): An Additive-Multiplicative Mean Residual Life Model for Right Censored Data | | |
9:30-10:00 | 潘灯 (Huazhong University of Science and Technology): Bayesian proportional hazards model with latent variables | | |
10:00-10:30 | Tea break | | |
10:30-11:00 | 何海金 (Shenzhen University): Additive mean residual life model with latent variables under right censoring | 蔡敬衡 | |
11:00-11:30 | 蒋杭进 (The Chinese University of Hong Kong): Revealing Free Energy Landscape of Proteins | | |
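II. Talk Titles and Abstracts
Network inference for compositional data and its application to metagenomic data
邓明华 (Peking University)
Abstract: In this talk, I will present several pieces of our group's work on compositional data analysis: CCLasso, a method for estimating the correlation network (that is, the correlation matrix) of compositional data, and gCoda and CDTrace, methods for estimating the conditional dependence network (that is, the precision matrix). In metagenomic sequencing data, only the relative abundances of species are meaningful, so the data are compositional. In correlation analysis of compositional data, ignoring the compositional constraint leads to erroneous conclusions. Under a sparsity assumption, we introduce different penalty functions to obtain the corresponding estimators.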
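The point that relative abundances can mislead a naive correlation analysis can be illustrated with a small sketch. This is not CCLasso, gCoda, or CDTrace; it only shows, on hypothetical abundance values, how closing two positively correlated abundances to proportions forces a correlation of -1, and it includes the centered log-ratio (CLR) transform, a standard tool in compositional data analysis. All function names and data below are assumptions made for illustration.

```python
import math


def closure(abundances):
    """Scale positive abundances so that they sum to 1 (relative abundances)."""
    total = sum(abundances)
    return [a / total for a in abundances]


def clr(composition):
    """Centered log-ratio transform; every part must be strictly positive."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical absolute abundances of two species in four samples.
species_a = [10.0, 20.0, 30.0, 40.0]
species_b = [5.0, 9.0, 31.0, 35.0]

# Sequencing only reveals the proportions within each sample.
samples = [closure([a, b]) for a, b in zip(species_a, species_b)]
rel_a = [s[0] for s in samples]
rel_b = [s[1] for s in samples]

print(pearson(species_a, species_b))  # positive: the species co-vary
print(pearson(rel_a, rel_b))          # -1 up to rounding: an artifact of closure
print(clr(samples[0]))                # CLR coordinates of the first sample
```

With only two parts, each proportion is one minus the other, so the negative correlation is forced by the constraint rather than by biology.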
Several Computational Problems in Comparative Methylation Analyses
樊晓丹 (The Chinese University of Hong Kong)
Abstract: Methylation may be a key factor in cell differentiation, and it is the most widely measured epigenetic modification. Much effort has been devoted to comparing the methylation profiles of different phenotypes in order to probe disease mechanisms. However, principled methods are still needed to handle the cell purity problem and the missing pair problem while efficiently integrating methylation array or sequencing data. We show that Bayesian methods can effectively solve these problems.
Mutational mapping: a versatile approach exploring adaptive evolution over large phylogenies
翟巍巍 (Genome Institute of Singapore)
Abstract: The classical methods for detecting adaptive evolution are codon-based likelihood models. However, the computational burden of these models is often very high. Mutational mapping offers a powerful alternative for exploring sequence evolution over large phylogenies. In this presentation, I will illustrate how this method can be used to analyze large influenza datasets and to understand adaptive evolution in the influenza virus.
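Mathematical modeling of vascular endothelial cell migration
雷锦誌 (Tsinghua University)
Abstract: Angiogenesis is an important problem in regenerative medicine and wound healing. Stem cell-mediated migration of vascular endothelial cells is a key step in angiogenesis, but its mechanism is not well understood. This talk presents a study of the migration mechanism of vascular endothelial cells that combines experiments with mathematical modeling. By combining experimental results with suitable assumptions, we build a mathematical model that explains the dynamics of endothelial cell migration well. This work contributes to a better understanding of the role of stem cells in promoting vascular regeneration and of the associated molecular mechanisms.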
Mutation mechanism of human breast cancer
张兴安 (Central China Normal University)
Abstract: In this talk I will show that human breast cancer requires 2 to 14 driver-mutation hits and that the mutation of stability genes must occur within the first three hits. This result implies that the mutation network of breast stem cells is composed of at least two and at most 14 genes.
Inferring gene transcription dynamics from single-cell transcriptome data
张家军 (Sun Yat-sen University)
A fusion penalized logistic threshold regression model with application to diabetes prediction
尹建鑫 (Renmin University of China)
Abstract: In many real problems, explanatory variables affect the response nonlinearly. A useful model in this setting is the threshold regression model: after adjusting for the effects of other variables, the response changes once the level of a certain variable exceeds some threshold. This framework can be generalized to multiple threshold levels. We study such a model for logistic regression and develop an algorithm that uses decision trees to determine the threshold cut points and then fits a fusion-penalized logistic regression model. Some theoretical guarantees are obtained under regularity conditions. The motivating application is routine physical examination data, where the practical requirement is a score model for diabetes prediction or alerting that relies only on routine examination results. In this study, each continuous variable is split into multiple discrete levels, and the coefficients, and hence the scores, should be close for adjacent levels. A fused lasso model is applied to the thresholded data to obtain the score for each categorical level of the explanatory variables. The corresponding theoretical properties are also obtained. Simulation studies and real data analysis both show that the model performs well.
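The two ingredients named in this abstract, cutting a continuous variable into ordered levels and penalizing differences between the coefficients of adjacent levels, can be sketched as follows. This is only an illustration under stated assumptions: the cut points are supplied by hand rather than learned by a decision tree, the coefficient values are hypothetical, and no model fitting is performed. The function names `to_level` and `fusion_penalty` are invented for this sketch.

```python
from bisect import bisect_right


def to_level(value, thresholds):
    """Return the index of the interval that contains `value`.

    `thresholds` must be sorted in increasing order. Level 0 lies below the
    first cut point; a value equal to a cut point falls in the level above it.
    """
    return bisect_right(thresholds, value)


def fusion_penalty(coefs, lam):
    """Fused-lasso style penalty: lam * sum of |beta[j+1] - beta[j]|."""
    return lam * sum(abs(b - a) for a, b in zip(coefs, coefs[1:]))


# Hypothetical cut points for one continuous measurement.
cut_points = [5.6, 7.0]

# Hypothetical per-level coefficients (scores) for the three resulting levels.
level_scores = [0.0, 0.1, 0.9]

print(to_level(6.1, cut_points))          # level of a value between the cut points
print(fusion_penalty(level_scores, 2.0))  # grows as adjacent scores diverge
```

A larger `lam` pushes adjacent levels toward a shared score, which is what keeps the resulting score table smooth across neighbouring levels.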
Comparing and Analyzing Microbial Communities on Metagenomic Data with Alignment-free Model
王颖
Abstract: Metagenome comparisons based on large amounts of next-generation sequencing (NGS) data pose significant challenges for alignment-based approaches because of the huge data size and the relatively short read length. Alignment-free approaches based on the counts of word patterns in NGS data do not depend on reference sequences and are generally computationally efficient. We applied statistical models based on short k-mers (2-10 bp) to the unsupervised comparison of metagenomic samples, and used long k-mer (>=20 bp) features to identify group-specific sequences that distinguish different groups.
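A minimal sketch of the alignment-free idea: count k-mers (words of length k) in the reads of each sample, turn the counts into frequency vectors, and compare samples through those vectors. The abstract does not specify which statistical model or dissimilarity measure is used, so the Euclidean distance below is a stand-in chosen for simplicity, and the reads are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in one read, skipping words with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(base in "ACGT" for base in word):
            counts[word] += 1
    return counts


def kmer_frequencies(reads, k):
    """Pool k-mer counts over all reads and return a frequency vector.

    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    """
    total = Counter()
    for read in reads:
        total.update(kmer_counts(read, k))
    n = sum(total.values())
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    return [total[w] / n if n else 0.0 for w in words]


def euclidean_distance(f, g):
    """Euclidean distance between two frequency vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


# Hypothetical reads from two metagenomic samples.
sample_1 = ["ACGTACGT", "TTGACA"]
sample_2 = ["GGGCCC", "ACGTAC"]

freq_1 = kmer_frequencies(sample_1, 2)
freq_2 = kmer_frequencies(sample_2, 2)
print(euclidean_distance(freq_1, freq_2))
```

No reference genome or alignment step appears anywhere in the computation, which is the source of the efficiency mentioned in the abstract.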
Mathematical modeling of HIV-1 Gag trafficking, polymerization and assembly
刘跃武 (Hunan Agricultural University), 邹秀芬 (Wuhan University)
Abstract: In this talk, I introduce three mathematical models describing HIV-1 Gag transport and polymerization, the assembly of HIV-1-like particles in vitro, and the assembly of HIV-1 immature capsids in vivo. Through theoretical and quantitative analysis, the pathogenesis of HIV infection is revealed and potential drug targets are predicted.
An Additive-Multiplicative Mean Residual Life Model for Right Censored Data
蔡敬衡 (Sun Yat-sen University)
Abstract: Many studies have focused on determining the effect of body mass index (BMI) on the mortality in different cohorts. In this article, we propose an additive-multiplicative mean residual life (MRL) model to assess the effects of BMI and other risk factors on the MRL function of survival time in a cohort of Chinese type 2 diabetic patients. The proposed model can simultaneously manage additive and multiplicative risk factors and provide a comprehensible interpretation of their effects on the MRL function of interest. We develop an estimation procedure through pseudo partial score equations to obtain parameter estimates. We establish the asymptotic properties of the proposed estimators and conduct simulations to demonstrate the performance of the proposed method. The application of the procedure to a study on the life expectancy of type 2 diabetic patients reveals new insights into the extension of the life expectancy of such patients.
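For readers unfamiliar with the quantity being modeled, the mean residual life at time t is m(t) = E[T - t | T > t], the expected remaining lifetime given survival beyond t. The sketch below computes its empirical version on hypothetical, fully observed survival times. It ignores right censoring and covariates, both of which the proposed model handles, so it illustrates the definition only and not the estimation procedure of the talk.

```python
def empirical_mrl(times, t):
    """Empirical mean residual life at time t for fully observed lifetimes.

    Averages (x - t) over the observations x that exceed t. Raises
    ValueError when no observation survives past t.
    """
    survivors = [x for x in times if x > t]
    if not survivors:
        raise ValueError("no observed lifetime exceeds t")
    return sum(x - t for x in survivors) / len(survivors)


# Hypothetical survival times in years, with no censoring.
lifetimes = [2.0, 4.0, 6.0, 8.0]

print(empirical_mrl(lifetimes, 0.0))  # mean lifetime from the start
print(empirical_mrl(lifetimes, 3.0))  # expected remaining years after year 3
```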
Bayesian proportional hazards model with latent variables
潘灯 (Huazhong University of Science and Technology)
Abstract: We consider a joint modeling approach that incorporates latent variables into a proportional hazards model to examine the observed and latent risk factors of the failure time of interest. An exploratory factor analysis model is used to characterize the latent risk factors through multiple observed variables. In commonly used confirmatory factor analysis, the number of latent variables and their observed indicators are specified prior to analysis. By contrast, the exploratory factor analysis model allows such information to be fully determined by the data. A Bayesian approach coupled with efficient sampling methods is developed to conduct statistical inference, and the performance of the proposed methodology is confirmed through simulations. The model is applied to a study on the risk factors of chronic kidney disease for patients with type 2 diabetes.
Additive mean residual life model with latent variables under right censoring
何海金 (Shenzhen University)
Abstract: We propose a novel additive mean residual life model to examine the effects of observable and latent risk factors on the mean residual life function of interest in the presence of right censoring. We use factor analysis to characterize the latent risk factors on the basis of multiple observed variables. We develop a borrow-strength estimation procedure that incorporates an asymptotically distribution-free generalized least squares method and a corrected estimating equation approach. We establish the asymptotic properties of the proposed estimators and develop a goodness-of-fit test for model checking. Simulations are reported to evaluate the finite-sample performance of the method. The application to a study on chronic kidney disease in type 2 diabetic patients reveals insights into the prevention of this common diabetic complication.
Revealing Free Energy Landscape of Proteins
蒋杭进 (The Chinese University of Hong Kong)
Abstract: The free energy landscape of a protein is key to understanding its function. In this talk, I will introduce our new method for estimating the number of stable states of a protein and the corresponding stable structures from molecular dynamics simulation data. We show through a benchmark example that our method compares favorably with existing methods. Furthermore, we apply the proposed method to HP35 NLE/NLE, an important protein, and find 6 stable states. A comparison with results from other methods suggests that our findings are more reasonable.