Public Cultural Service Platform

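Identifying cancer genes from cancer genome mutation profiles using cancer functional categories: 2008; Identifying cancer genes from large-scale mutation screening data of cancer samples is of great importance. Alterations of certain important functions are necessary for the initiation and progression of cancer, so we define them as cancer functional categories and select from GO (Gene Ontology) a set of detailed functional categories significantly enriched with known cancer genes to represent them. To evaluate how well cancer-associated functional categories work as features for identifying cancer genes, known protein kinase cancer genes are defined as gold-standard positives and the other protein kinase genes as gold-standard negatives. The results show that, compared with the method using selection pressure as a feature, the method using cancer-associated functional categories identifies cancer genes more effectively. Further combining cancer-associated functional categories with the number of non-synonymous mutations per gene yields more reliable predictions. Finally, 46 protein kinase genes that are annotated to cancer-associated functional categories and carry at least 3 non-synonymous mutations are predicted as cancer genes, with a precision of 0.42.; Li Yanhui, Guo Zheng, Peng Chunfang, Liu Qing, Ma Wencai, Wang Jing, Yao Chen, Zhang Min, Zhu Jing; Keywords: mutation profile; cancer gene; Gene Ontology; protein function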

Identifying cancer genes from cancer mutation profiles by cancer functions (cited by: 1): 2008; It is of great importance to identify new cancer genes from large-scale genome screenings of gene mutations in cancers. Because alterations of some essential functions are indispensable for oncogenesis, we define them as cancer functions and select, as their approximations, a group of detailed functions in GO (Gene Ontology) highly enriched with known cancer genes. To evaluate the efficiency of using cancer functions as features to identify cancer genes, we define, among the screened genes, the known protein kinase cancer genes as gold standard positives and the other kinase genes as gold standard negatives. The results show that cancer-associated functions are more efficient in identifying cancer genes than the selection pressure feature. Furthermore, combining cancer functions with the number of non-silent mutations can generate more reliable positive predictions. Finally, with a precision of 0.42, we suggest a list of 46 kinase genes as candidate cancer genes, which are annotated to cancer functions and carry at least 3 non-silent mutations.; LI YanHui1, GUO Zheng1,2, PENG ChunFang2, LIU Qing2, MA WenCai2, WANG Jing2, YAO Chen2, ZHANG Min2 & ZHU Jing1 1 Bioinformatics Centre, School of Life Science, University of Electronic Science and Technology of China, Chengdu 610054, China; Keywords: MUTATION PROFILE CANCER GENE GENE ONTOLOGY GENE
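The final selection rule in the abstract above (annotated to a cancer function and carrying at least 3 non-silent mutations, then scored by precision against the gold standard) can be sketched as follows. This is only an illustration under stated assumptions: the `Gene` record, the gene names, and the toy data are hypothetical and are not taken from the paper, and the sketch does not reproduce the reported precision of 0.42.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for one screened kinase gene."""
    name: str
    in_cancer_function: bool    # annotated to a cancer-associated GO function
    non_silent_mutations: int   # number of non-silent mutations observed
    known_cancer_gene: bool     # gold-standard label


def predict_candidates(genes, min_mutations=3):
    """Keep genes annotated to a cancer function with enough mutations."""
    return [
        g for g in genes
        if g.in_cancer_function and g.non_silent_mutations >= min_mutations
    ]


def precision(predicted):
    """Fraction of predicted genes that are gold-standard positives."""
    if not predicted:
        return 0.0
    return sum(g.known_cancer_gene for g in predicted) / len(predicted)


# Toy data, invented for illustration only.
genes = [
    Gene("KIN_A", True, 5, True),
    Gene("KIN_B", True, 3, False),
    Gene("KIN_C", True, 1, True),    # too few mutations
    Gene("KIN_D", False, 7, False),  # not in a cancer function
    Gene("KIN_E", True, 4, True),
]

candidates = predict_candidates(genes)
candidate_names = [g.name for g in candidates]
candidate_precision = precision(candidates)
print(candidate_names, round(candidate_precision, 2))
```

Raising `min_mutations` trades recall for precision, which is the effect the abstract describes when it combines the two features.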

Identifying disease feature genes based on cellular localized gene functional modules and regulation networks (cited by: 3): 2006; Identifying disease-relevant genes and functional modules, based on gene expression profiles and gene functional knowledge, is of high importance for studying disease mechanisms and subtyping disease phenotypes. Using gene categories of biological process and cellular component in Gene Ontology, we propose an approach to selecting functional modules enriched with differentially expressed genes, and identifying the feature functional modules of high disease discriminating abilities. Using the differentially expressed genes in each feature module as the feature genes, we reveal the relevance of the modules to the studied diseases. Using three datasets for prostate cancer, gastric cancer, and leukemia, we have demonstrated that the proposed modular approach is of high power in identifying functionally integrated feature gene subsets that are highly relevant to the disease mechanisms. Our analysis has also shown that the critical disease-relevant genes might be better recognized from the gene regulation network, which is constructed using the characterized functional modules, giving important clues to the concerted mechanisms of the modules responding to complex disease states. In addition, the proposed approach to selecting the disease-relevant genes by jointly considering the gene functional knowledge suggests a new way for precisely classifying disease samples with clear biological interpretations, which is critical for the clinical diagnosis and the elucidation of the pathogenic basis of complex diseases.; ZHANG Min, ZHU Jing, GUO Zheng, LI Xia, YANG Da, WANG Lei, RAO Shaoqi; Keywords: feature gene; gene duplication

Widely predicting specific protein functions based on protein-protein interaction data and gene expression profile (cited by: 3): 2007; GESTs (gene expression similarity and taxonomy similarity), a gene functional prediction approach previously proposed by us, is based on gene expression similarity and concept similarity of functional classes defined in Gene Ontology (GO). In this paper, we extend this method to protein-protein interaction data by introducing several methods to filter the neighbors in protein interaction networks for a protein of unknown function(s). Unlike other conventional methods, the proposed approach automatically selects the most appropriate functional classes as specific as possible during the learning process, and calls on genes annotated to nearby classes to support the predictions to some small-sized specific classes in GO. Based on the yeast protein-protein interaction information from MIPS and a dataset of gene expression profiles, we assess the performances of our approach for predicting protein functions to “biology process” by three measures particularly designed for functional classes organized in GO. Results show that our method is powerful for widely predicting gene functions with very specific functional terms. Based on the GO database published in December 2004, we predict some proteins whose functions were unknown at that time, and some of the predictions have been confirmed by the new SGD annotation data published in April 2006.; GAO Lei1, LI Xia1,2, GUO Zheng1,2, ZHU MingZhu1, LI YanHui1 & RAO ShaoQi1,3 1 Department of Bioinformatics, Harbin Medical University, Harbin 150086, China; Keywords: GENE PROTEIN-PROTEIN GENE ONTOLOGY SIMILARITY GENE

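Identifying disease feature genes using subcellular-location-specific gene functional modules and expression regulation networks (cited by: 3): 2006; Using two classification systems in Gene Ontology, biological process and cellular component, we select composite functional modules significantly enriched with differentially expressed genes, identify among them the feature functional modules that can effectively classify disease samples, and use the differentially expressed genes in the feature functional modules as features and analyze their relevance to disease. Analysis of prostate cancer, gastric cancer, and leukemia data shows that the feature gene selection method based on feature functional modules can identify functionally coherent feature genes highly relevant to the disease. Further analysis shows that, by constructing a gene expression regulation network from the feature functional modules and gene expression regulation information, candidate key disease feature genes can be mined from it, giving important clues for studying the mechanisms of cooperation among multiple functional modules that respond together to complex diseases. In addition, the disease feature gene selection method of this study, which incorporates gene functional classification, suggests a highly accurate approach to disease classification whose results have clear biological meaning, and is also important for research on the molecular pathology of complex diseases.; Zhang Min, Zhu Jing, Guo Zheng, Li Xia, Yang Da, Wang Lei, Rao Shaoqi; Keywords: gene expression profile; feature gene; gene regulatory network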

Functional modules with disease discrimination abilities for various cancers (cited by: 3): 2011; Selecting differentially expressed genes (DEGs) is one of the most important tasks in microarray applications for studying multi-factor diseases including cancers. However, the small samples typically used in current microarray studies may only partially reflect the widely altered gene expressions in complex diseases, which would introduce low reproducibility of gene lists selected by statistical methods. Here, by analyzing seven cancer datasets, we showed that, in each cancer, a wide range of functional modules have altered gene expressions and thus have high disease classification abilities. The results also showed that seven modules are shared across diverse cancers, suggesting hints about the common mechanisms of cancers. Therefore, instead of relying on a few individual genes whose selection is hardly reproducible in current microarray experiments, we may use functional modules as functional signatures to study core mechanisms of cancers and build robust diagnostic classifiers.; YAO Chen, ZHANG Min, ZOU JinFeng, LI HongDong, WANG Dong, ZHU Jing, GUO Zheng

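Effect of outlier handling in Oligo microarrays on supervised disease classification: 2008; Expression profile data produced by microarray experiments contain many unqualified measurement points, and different treatments of outliers strongly affect the results of supervised disease classification. To address this, for Oligo chip data, measured values are usually preprocessed at the expression level by limiting them to a maximum and a minimum before downstream analysis. This study selected four Oligo chip datasets and applied different ways of limiting the maximum and minimum values, examining how they affect the performance of three classifiers (support vector machine, K-nearest neighbor, decision tree) in classifying disease samples. The results show that limiting the maximum and minimum to 16000 and 100, respectively, as done by Dudoit et al., is a reasonable strategy that achieves good classification. We also found that, for datasets with many measured values below 100, limiting the minimum to 10 classifies equally well and retains more of the original data for downstream analysis. Therefore, reasonably limiting outliers in Oligo chips is a good strategy for improving disease subtyping. We further used a functional expression profile approach, constructing mean or median indicators that reflect the overall expression state of all genes annotated to a functional node, and performed classification using the resulting functional expression profiles. Different outlier-limiting methods had little effect on the accuracy of classification based on functional expression profiles, which gave relatively stable results.; Lü Yingli, Wang Dong, Guo Zheng, Yu Liangliang, Li Yanhui, Zhu Jing, Wang Chenguang; Keywords: gene expression profile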
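The thresholding step described in this abstract amounts to clipping each measured value to a fixed range. A minimal sketch, assuming expression values are plain floats in a list: the function name `clip_expression` and the sample values are hypothetical, while the limits 100, 16000, and 10 are the ones quoted in the abstract.

```python
def clip_expression(values, floor=100.0, ceiling=16000.0):
    """Limit each expression value to the range [floor, ceiling]."""
    if floor > ceiling:
        raise ValueError("floor must not exceed ceiling")
    return [min(max(v, floor), ceiling) for v in values]


# Invented measurements for illustration only.
raw = [12.0, 85.0, 100.0, 2500.0, 20000.0]

# Limits of 100 and 16000, as quoted in the abstract.
default_clipped = clip_expression(raw)

# Alternative minimum of 10 for datasets with many values below 100.
low_floor_clipped = clip_expression(raw, floor=10.0)

print(default_clipped)
print(low_floor_clipped)
```

With the lower minimum, the two small values are kept as measured instead of being raised to 100, which is what the abstract means by retaining more of the original data.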

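Consistency analysis of measurements from duplicate probes on cDNA microarrays: 2009; Probes in four cDNA datasets were annotated using the SOURCE database, and the correlations among the measurements of multiple probes corresponding to the same UniGene entry (i.e., duplicate measurements) were analyzed. Two conventional methods were used to handle duplicate measurements, and their effects on selecting differentially expressed genes were compared. The results show that: a certain proportion of duplicate measurements for a UniGene entry are negatively correlated; after updating the probe annotation data, the proportion of weakly correlated duplicate measurements decreases and the proportion of highly correlated ones increases significantly; replicate-spotted probes correlate more highly than other duplicate measurements, but many still correlate weakly; and the two methods of handling duplicate measurements make little difference to the selection of differentially expressed genes by the significance analysis method for differential gene expression (SAM) or the t-test.; Yu Liangliang, Wang Dong, Lü Yingli, Xiao Hui, Yang Qiang, Wang Chenguang, Guo Zheng, Li Xia; Keywords: differentially expressed gene; correlation coefficient; probe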
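The consistency check described in this abstract can be illustrated by computing pairwise correlations among probes that map to the same UniGene entry and counting how many are negative. The abstract does not say which correlation coefficient was used, so Pearson correlation is an assumption here, and the probe names and values are invented.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(probes):
    """Correlate every pair of probes assigned to one UniGene entry."""
    return {
        (p, q): pearson(probes[p], probes[q])
        for p, q in combinations(sorted(probes), 2)
    }


# Hypothetical probes for one UniGene entry, measured on four arrays.
probes = {
    "probe_1": [1.0, 2.0, 3.0, 4.0],
    "probe_2": [2.0, 4.0, 6.0, 8.0],
    "probe_3": [4.0, 3.0, 2.0, 1.0],
}

correlations = pairwise_correlations(probes)
negative_fraction = sum(r < 0 for r in correlations.values()) / len(correlations)
print(correlations, negative_fraction)
```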

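Predicting detailed functions of breast cancer-related proteins from protein interaction networks (cited by: 2): 2007; Breast cancer is one of the most common malignant tumors. Existing functional annotations of breast cancer-related proteins are rather broad, which constrains follow-up breast cancer research. For breast cancer-related proteins with partially known functions, we propose a method that combines Gene Ontology functional prior knowledge with protein interactions, predicting detailed functions of breast cancer-related proteins by constructing function-specific local interaction networks. The results show that the method can predict finer functions for breast cancer-related proteins with high precision. The predicted functions are valuable for guiding experimental studies of the molecular mechanisms of breast cancer.; Wang Jing, Li Yanhui, Guo Zheng, Zhu Jing, Ma Wencai, Peng Chunfang, Liu Qing; Keywords: breast cancer; protein function; Gene Ontology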

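Effect of missing-value handling in cDNA microarrays on disease classification based on gene expression profiles (cited by: 3): 2006; Four cDNA microarray datasets were selected, and missing values in genes with missing measurements were imputed using zero filling and K-nearest neighbors, respectively; the effects of the different treatments on the classification performance of three classifiers (support vector machine, K-nearest neighbor, decision tree) were analyzed. The results show that, in cDNA gene expression profile data, imputing missing values for genes with a missing rate of no more than 5% is a good strategy: it retains more genes for subsequent functional analysis while still maintaining high disease classification performance.; Wang Dong, Guo Zheng, Li Xia, Lü Yingli, Zhu Jing, Wang Chenguang; Keywords: gene expression profile; missing value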
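The strategy in this abstract (keep genes whose missing rate is at most 5%, then fill the gaps by zero filling or K-nearest neighbors) can be sketched as below. This is a simplified illustration, not the authors' implementation: missing values are represented as `None`, the gene-wise distance and neighbor averaging are assumptions, and all names and data are hypothetical.

```python
import math


def missing_rate(row):
    """Fraction of samples in which the gene has no measurement."""
    return sum(v is None for v in row) / len(row)


def filter_genes(matrix, max_missing_rate=0.05):
    """Keep genes whose missing rate does not exceed the threshold."""
    return {g: r for g, r in matrix.items() if missing_rate(r) <= max_missing_rate}


def zero_fill(row):
    """Replace every missing value with zero."""
    return [0.0 if v is None else v for v in row]


def distance(a, b):
    """Root-mean-square difference over samples observed in both genes."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def knn_fill(matrix, gene, k=2):
    """Fill gaps in one gene with the mean of its k nearest genes' values."""
    target = matrix[gene]
    filled = list(target)
    for j, value in enumerate(target):
        if value is not None:
            continue
        # Candidate donors: other genes that were measured in sample j.
        donors = sorted(
            (distance(target, row), row[j])
            for name, row in matrix.items()
            if name != gene and row[j] is not None
        )[:k]
        filled[j] = sum(v for _, v in donors) / len(donors) if donors else 0.0
    return filled


# Invented data: 4 genes measured on 20 samples.
matrix = {
    "g1": [1.0] * 19 + [None],        # 5% missing, kept
    "g2": [None, None] + [2.0] * 18,  # 10% missing, dropped
    "g3": [1.0] * 20,
    "g4": [3.0] * 20,
}

kept = filter_genes(matrix)
g1_zero = zero_fill(kept["g1"])
g1_knn = knn_fill(kept, "g1", k=1)
print(sorted(kept), g1_zero[-1], g1_knn[-1])
```

In this toy case zero filling writes 0.0 into the gap, while the nearest-neighbor fill borrows 1.0 from `g3`, the gene whose observed profile matches `g1`.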


National Natural Science Foundation of China (30370388)
