With the rapid development of Web 2.0, more and more people share their opinions about products online, producing a large volume of product review data. However, it is difficult to compare products directly using ratings, because ratings are often given on different scales or are missing altogether. This paper addresses the following question: given textual reviews, how can we automatically determine the semantic orientations of reviewers and then rank different items? Because many reviews carry no ratings, it is difficult to collect sufficient rating data for certain categories of products (e.g., movies), whereas rating data is often easier to find in a different but related category (e.g., books). We refer to this problem as transfer rating and aim to train a better ranking model for items in the category of interest with the help of rating data from a related category. Specifically, we develop a ranking-oriented method called TRate for determining semantic orientations and ranking items, and formulate it as a regularized algorithm that transfers rating knowledge by bridging the two related categories via a shared latent semantic space. Tests on the Epinion dataset verified its effectiveness.
A new two-step framework is proposed for image segmentation. In the first step, the gray-value distribution of the given image is reshaped to have larger inter-class variance and smaller intra-class variance. In the second step, discriminant-based or clustering-based methods are applied to the reshaped distribution. The paper focuses on a typical clustering method, the Gaussian mixture model (GMM), and its variant to demonstrate the feasibility of the framework. Because the first step is independent of the second, it can be integrated into both pixel-based and histogram-based methods to improve their segmentation quality. Experiments on artificial and real images show that the framework achieves effective and robust segmentation results.
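As a rough illustration of the clustering step only, the sketch below fits a two-component one-dimensional Gaussian mixture to gray values with EM and labels each pixel by its more probable component. The abstract does not specify how the gray-value distribution is reshaped in the first step, so that step is omitted here. The function names (`fit_gmm_1d`, `segment`), the initialization, and the variance floor are illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_iter=50, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Initialization (assumed): means at the min and max gray value,
    both variances set to the overall variance, equal weights.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    means = [float(min(values)), float(max(values))]
    variances = [max(overall_var, var_floor)] * 2
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, var_floor)
            weights[k] = nk / n

    return weights, means, variances


def segment(values, weights, means, variances):
    """Assign each gray value to the component with the larger weighted density."""
    labels = []
    for v in values:
        scores = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
        labels.append(0 if scores[0] >= scores[1] else 1)
    return labels


# Toy gray values: a dark region and a bright region.
gray_values = [10, 12, 11, 13, 9, 200, 205, 198, 202, 210]
params = fit_gmm_1d(gray_values)
labels = segment(gray_values, *params)
```

In the framework described above, the same fitting routine would be run on the reshaped distribution instead of the raw gray values; a larger inter-class variance and smaller intra-class variance would make the two components easier to separate.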
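Wireless sensor networks are an important component of the Internet of Things (IoT), and using them to localize targets in the IoT has become an active research topic. Owing to factors such as the environment, obstacles, network attacks, and hardware errors, the data collected by sensor nodes is prone to large errors, producing erroneous data that seriously affects localization. Although many localization algorithms and models have been developed, studies on localization in the presence of erroneous data remain rare, and in China there are almost none. To address this problem, this paper uses the (geometric) topological structure of the network to propose a novel IoT localization model that characterizes global distribution density information using local information: LE-RLPCCA, a robust locality-preserving canonical correlation analysis localization model. Experimental results in real environments show that LE-RLPCCA is more robust and stable in localization than typical existing methods of the same kind.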
Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP, ACM, and CiteSeerx. Despite many studies, the problem remains unresolved and is becoming more serious, especially with the increasing popularity of Web 2.0. This paper addresses the problem in the academic researcher social network ArnetMiner using a supervised method that exploits all side information, including co-authors, organizations, paper citations, title similarity, authors' homepages, web constraints, and user feedback. The method automatically determines the number of persons k. Tests on the researcher social network with up to 100 different names show that the method significantly outperforms a baseline that uses an unsupervised attribute-augmented graph clustering algorithm.
A novel regularization method, discriminative regularization (DR), is presented. The method provides a general way to incorporate prior knowledge into classification. By introducing prior information into the regularization term, DR minimizes the empirical loss between the desired and actual outputs while simultaneously maximizing the inter-class separability and minimizing the intra-class compactness in the output space. Furthermore, by embedding equality constraints in the formulation, the solution of DR can be obtained by solving a set of linear equations. Classification experiments show the superiority of the proposed DR.
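The abstract does not give the DR objective, so the following is only a loose sketch of the general idea under stated assumptions: a linear model with a squared loss, and a regularizer of the assumed form eta * S_w - (1 - eta) * S_b, where S_w and S_b are the within-class and between-class scatter matrices. Under these assumptions the minimizer satisfies the linear system (X^T X + lam * R) theta = X^T y, which is solved here by Gaussian elimination. All names and the values of `lam` and `eta` are hypothetical, and the equality-constrained formulation mentioned above is not reproduced.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def scatter_matrices(points, labels):
    """Within-class (S_w) and between-class (S_b) scatter matrices."""
    d, n = len(points[0]), len(points)
    mean_all = [sum(p[j] for p in points) / n for j in range(d)]
    s_w = [[0.0] * d for _ in range(d)]
    s_b = [[0.0] * d for _ in range(d)]
    for cls in set(labels):
        members = [p for p, y in zip(points, labels) if y == cls]
        mean_c = [sum(p[j] for p in members) / len(members) for j in range(d)]
        for i in range(d):
            for j in range(d):
                s_w[i][j] += sum((p[i] - mean_c[i]) * (p[j] - mean_c[j]) for p in members)
                s_b[i][j] += len(members) * (mean_c[i] - mean_all[i]) * (mean_c[j] - mean_all[j])
    return s_w, s_b


def fit(points, labels, lam=0.1, eta=0.9):
    """Return [w_1, ..., w_d, bias] for the assumed regularized least-squares problem."""
    d = len(points[0])
    s_w, s_b = scatter_matrices(points, labels)
    aug = [list(p) + [1.0] for p in points]  # append 1 for the bias term
    k = d + 1
    lhs = [[sum(x[i] * x[j] for x in aug) for j in range(k)] for i in range(k)]
    for i in range(d):  # the regularizer acts on the weights, not the bias
        for j in range(d):
            lhs[i][j] += lam * (eta * s_w[i][j] - (1.0 - eta) * s_b[i][j])
    rhs = [sum(x[i] * y for x, y in zip(aug, labels)) for i in range(k)]
    return solve_linear_system(lhs, rhs)


def predict(theta, point):
    """Sign of the linear score; theta[-1] is the bias."""
    score = sum(w * x for w, x in zip(theta, point)) + theta[-1]
    return 1 if score >= 0 else -1


# Toy two-class data with labels -1 and +1.
points = [(0, 0), (1, 0), (0, 1), (4, 4), (5, 4), (4, 5)]
labels = [-1, -1, -1, 1, 1, 1]
theta = fit(points, labels)
predictions = [predict(theta, p) for p in points]
```

Because the between-class term enters with a negative sign, the system matrix is only guaranteed to stay well conditioned when `lam * (1 - eta)` is small relative to the data term; the defaults above are chosen with that in mind.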