Developing novel algorithms that combine the sequence patterns of regulatory elements, such as enhancers and silencers, with the patterns of splicing signals is important for splice site prediction. In this paper, a statistical model of splicing signals was built based on the entropy density profile (EDP) method, the weight array method (WAM), and the κ test; in addition, a model of splicing regulatory elements was developed using an unsupervised self-learning method to detect motifs associated with regulatory elements. With the two models incorporated, a multi-level support vector machine (SVM) system was devised to perform ab initio prediction of splice sites from DNA sequence in eukaryotic genomes. Results of large-scale tests on human genomic splice sites show that the new method achieves comparatively high performance in splice site prediction. The method performs at least as well as, and usually better than, the existing SpliceScan method, which is based on modeling regulatory elements, and it achieves higher accuracy than traditional methods that model splicing signals, such as GeneSplicer. In particular, the method has a clear advantage in splice site prediction for genes with lower GC content.
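As an illustration of one component named above, the following is a minimal sketch of weight array method (WAM) scoring, assuming the common formulation in which each position of an aligned signal window is modeled by a first-order Markov dependency on the preceding base. The function names, the pseudocount, and the toy sequences are hypothetical and serve only to make the example runnable; this is not the authors' implementation, and the EDP, κ test, and SVM stages are not shown.

```python
import math

ALPHABET = "ACGT"

def train_wam(seqs, pseudocount=1.0):
    """Fit position-specific first-order Markov tables to aligned, equal-length sequences."""
    length = len(seqs[0])
    # Counts for the base at position 0 (no predecessor available).
    first = {b: pseudocount for b in ALPHABET}
    # trans[i][(a, b)]: count of base a at position i-1 followed by base b at position i.
    trans = [{(a, b): pseudocount for a in ALPHABET for b in ALPHABET}
             for _ in range(length)]
    for s in seqs:
        first[s[0]] += 1
        for i in range(1, length):
            trans[i][(s[i - 1], s[i])] += 1

    total_first = sum(first.values())
    log_first = {b: math.log(c / total_first) for b, c in first.items()}

    log_trans = [None]  # index 0 is unused: position 0 has no transition
    for i in range(1, length):
        table = {}
        for a in ALPHABET:
            row_total = sum(trans[i][(a, b)] for b in ALPHABET)
            for b in ALPHABET:
                table[(a, b)] = math.log(trans[i][(a, b)] / row_total)
        log_trans.append(table)
    return log_first, log_trans

def wam_log_prob(model, seq):
    """Log-probability of seq under a trained WAM."""
    log_first, log_trans = model
    lp = log_first[seq[0]]
    for i in range(1, len(seq)):
        lp += log_trans[i][(seq[i - 1], seq[i])]
    return lp

def wam_score(true_model, false_model, seq):
    """Log-odds score: positive values favor the true-site model."""
    return wam_log_prob(true_model, seq) - wam_log_prob(false_model, seq)

# Toy, made-up windows around a GT dinucleotide (illustrative only, not real data).
toy_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
toy_decoys = ["CAGGTCCTT", "TTGGTTTCA", "ACGGTGCCC", "GGGGTCTTG"]

true_model = train_wam(toy_sites)
false_model = train_wam(toy_decoys)
site_score = wam_score(true_model, false_model, "CAGGTAAGT")
decoy_score = wam_score(true_model, false_model, "TTGGTTTCA")
```

In a pipeline of the kind described, a score like this would be one feature among several passed to a classifier; how the features are combined here is not specified beyond the use of a multi-level SVM.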
By combining knowledge from the KEGG database and the literature, we construct a comprehensive gene network for human mitochondria. The network comprises 2442 genes in 9 functional categories, including metabolism, development, immunity, and apoptosis. Topological analysis reveals that the network is scale-free. The high-degree hubs are mostly apoptosis genes. Three large modules are found in the network, representing development and cell growth, metabolism, and immunity and apoptosis, respectively, suggesting the multiplicity of functions and the complexity of gene regulation in mitochondria.
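The kind of topological analysis mentioned above can be sketched as follows: compute node degrees from an undirected edge list, rank the hubs, and estimate the slope of the degree distribution on a log-log scale. This is a minimal sketch under stated assumptions. The edge list is a made-up toy graph with placeholder node names, not the mitochondrial network, and a least-squares fit on log-log axes is only a crude heuristic for scale-free behavior rather than the procedure used in the paper.

```python
import math
from collections import Counter

def degrees(edges):
    """Node degrees for an undirected graph, ignoring self-loops and duplicate edges."""
    unique = {frozenset((u, v)) for u, v in edges if u != v}
    deg = Counter()
    for edge in unique:
        for node in edge:
            deg[node] += 1
    return deg

def degree_distribution(deg):
    """Map each observed degree k to P(k), the fraction of nodes with that degree."""
    counts = Counter(deg.values())
    n = len(deg)
    return {k: c / n for k, c in sorted(counts.items())}

def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k (needs at least two distinct degrees)."""
    xs = [math.log(k) for k in dist]
    ys = [math.log(p) for p in dist.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical toy edge list; node names are placeholders, not real genes.
toy_edges = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"),
             ("hub", "g4"), ("g1", "g2"), ("g5", "g4")]

deg = degrees(toy_edges)
top_hubs = deg.most_common(1)      # highest-degree nodes first
dist = degree_distribution(deg)
slope = loglog_slope(dist)         # negative when high degrees are rare
```

For a network approximately following P(k) ∝ k^(−γ), the fitted slope approximates −γ; on real data a maximum-likelihood fit is generally preferred over this regression.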