Hebei GEO University (河北地质大学)
Undergraduate Graduation Project (Thesis): Foreign Literature Translation

Title: A Study of Multiple Linear Model Algorithms Based on Watermelon Prediction
Name: Zhang Zejun (张则君)
Major: Software Engineering
Supervisor: Wang Shenwen (汪慎文)
Date: May 28, 2016

English text:

A logistic regression model for Semantic Web service matchmaking

Abstract

Semantic Web service matchmaking, as one of the most challenging problems in Semantic Web services (SWS), aims to filter and rank a set of services with respect to a service query by using a certain matching strategy. In this paper, we propose a logistic regression based method to aggregate several matching strategies instead of a fixed integration (e.g., the weighted sum) for SWS matchmaking. The logistic regression model is trained on training data derived from binary relevance assessments of existing test collections, and then used to predict the probability of relevance between a new pair of query and service according to their matching values obtained from various matching strategies. Services are then ranked according to the probabilities of relevance with respect to each query. Our method is evaluated on two main test collections, SAWSDL-TC2 and the Jena Geography Dataset (JGD). Experimental results show that the logistic regression model can effectively predict the relevance between a query and a service, and hence can improve the effectiveness of service matchmaking.

Keywords: Semantic Web service, matchmaking, logistic regression

1 Introduction

Semantic Web services (SWS), as an application of the ideas of the Semantic Web to service-oriented computing, have attracted much attention recently [1]. SWS matchmaking is one of the most challenging problems in SWS [2]. It aims to filter and rank a set of services with respect to a query by using a certain matching strategy that measures the similarity between a query and a service. A variety of competing matching strategies have been proposed recently [3,4]. Among them, integrated matching strategies, which combine the matching results obtained from different matching strategies, have been shown to be promising according to the intensive comparisons in various service matchmaking contests. Integration provides a comprehensive and complementary way to measure the similarity between a query and a service by considering different descriptions of Web services. Thus, how to effectively integrate the individual similarity values obtained from useful matching strategies into an overall score becomes an important issue.

An intuitive way of integrating is to use empirical values as the weights of the different matching strategies. For example, URBE [5] uses a weighted sum to integrate several similarity values into an overall score. However, these empirical weights are difficult to predict correctly in practice because the characteristics of applications vary. To alleviate this problem, several machine learning based methods have been used to learn these weights for service discovery. Christopher et al. [6] proposed the SWS matchmaker iMatcher, which integrates various text similarity measures using different machine learning algorithms. Klusch et al. [7] proposed the SAWSDL service matchmaker SAWSDL-MX2, which uses a support vector machine (SVM) to integrate three matching variants: logic-based matching, text similarity based matching of semantic annotations, and structural matching.

The logistic regression model is a popular model for binary data prediction, regression and classification [8], and it has been successfully applied in several applications such as text retrieval [9]. Essentially, the service matchmaking problem can be viewed as a binary data prediction problem: judging whether a service is relevant to a query or not. In addition, logistic regression provides a standard way to analyze the contribution of each matching strategy to service matchmaking in a specific domain according to the estimates of the coefficients, which is of practical help for domain experts who need to select appropriate matching strategies for their specific applications. Based on this insight, in this paper we propose a method that exploits the logistic regression model to integrate various matching strategies and to predict the probability of relevance between a query and a service based on their individual matching scores. Following our previous work [10,11], we adopt several matching strategies to compute the individual similarity values, and then integrate them into an overall similarity using the trained logistic regression model. Experimental results show that the logistic regression model outperforms all basic matching strategies in terms of recall and precision, and also outperforms the well-known integrated matchmakers.

2 Experimental results

In this evaluation, we use two test collections from the Semantic Service Selection (S3) contest: SAWSDL-TC2 and the Jena Geography Dataset (JGD). Each test collection is represented by a set of |Q| × |P| vectors in a matrix S, in which Q and P denote the sets of queries and services in the test collection, respectively. Each row of S corresponds to a sample, which holds the similarity values of one query-service pair under each of the matching strategies. The set of samples is divided into |Q| folds, and each fold consists of all the samples related to one query. Each time, we take one fold as the test set (related to one query), learn the logistic regression model on the remaining |Q| − 1 folds, and then measure the effectiveness on the test query. Finally, the macro-average of the results of the |Q| runs is taken as the performance of the statistical model based matching strategies on the whole test collection. This approach follows the standard N-fold cross validation in machine learning. To show the performance of our method, in this paper we also implement other machine learning based matchmaking methods on the same matching strategies by using WEKA [12], such as -SVR, linear regression, J48 decision tree, Adaboosting based J48, etc.

In addition, we compare our method with the well-known SVM based matchmaker SAWSDL-MX2, which integrates matching strategies different from those used in this paper. The mean average precision (MAP) of our method is 0.749 on SAWSDL-TC2 and 0.67 on JGD, while the MAP of SAWSDL-MX2 is 0.679 on SAWSDL-TC2 and 0.45 on JGD.

In summary, our logistic regression model can effectively integrate the commonly used matching strategies shown in Table 1, and it also improves the effectiveness of service matchmaking by letting the strong points of some strategies offset the weaknesses of others. The results also indicate that selecting proper basic matching strategies is very important to integrated service matchmaking, since each matching strategy may contribute differently. This is another advantage of our method, since logistic regression can help us select proper matching strategies according to the estimates of the coefficients.

3 Conclusions

This paper proposes a novel method for Semantic Web service matchmaking, which employs logistic regression to aggregate the multiple results obtained from several basic matching strategies into an overall similarity value. Experiments show that the logistic regression model provides an overall and almost overwhelming performance. We can, therefore, conclude that the logistic regression model used in this paper is effective and appropriate for integrating individual similarity values obtained from various matching strategies on different description components.

Chinese translation (rendered in English):

A logistic regression model for Semantic Web service matchmaking

Abstract

Semantic Web service matchmaking is one of the most challenging problems in Semantic Web services. It aims to filter and rank a set of services with respect to a service query by using a certain matching strategy. In this paper, we propose a logistic regression based method for Semantic Web service matchmaking that aggregates several matching strategies instead of using a fixed integration such as the weighted sum. The logistic regression model is trained on data derived from the binary relevance assessments of existing test collections, and is then used to predict the probability of relevance for a new query-service pair from the matching values produced by the different matching strategies. Services are then ranked by their probability of relevance to each query. Our method is evaluated on two main test collections, SAWSDL-TC2 and the Jena Geography Dataset (JGD). Experimental results show that the logistic regression model can effectively predict the relevance between a query and a service and can therefore improve the effectiveness of service matchmaking.

Keywords: Semantic Web service; matchmaking; logistic regression

1 Introduction

Semantic Web services, as an application of the ideas of the Semantic Web to service-oriented computing, have attracted much attention recently. SWS matchmaking is one of the most challenging problems. It aims to filter and rank a set of services by using a matching strategy that computes the similarity between a query and a service. A variety of competing matching strategies have been proposed recently. According to extensive comparisons in various service matchmaking contests, integrated matching strategies, which combine the matching results obtained from different matching strategies, have proved promising. Integration provides a comprehensive and complementary way to measure the similarity between a query and a service by considering the different descriptions of Web services. Therefore, how to effectively integrate the individual similarity values obtained from useful matching strategies into an overall score becomes an important issue.

An intuitive integration method is to use empirical values as the weights of the different matching strategies. For example, URBE uses a weighted sum to combine several similarity values into an overall score. However, because applications differ in their characteristics, these weights are hard to predict accurately in practice. To alleviate this problem, several machine learning based methods have been used to learn these weights for service discovery. Christopher et al. proposed an SWS matchmaker that integrates various text similarity measures using different machine learning algorithms. Klusch et al. proposed the SAWSDL service matchmaker SAWSDL-MX2, which uses a support vector machine (SVM) to integrate three matching variants: logic-based matching, text similarity based matching of semantic annotations, and structural matching.

The logistic regression model is a popular model for binary data prediction, regression and classification, and it has been successfully applied in several applications such as text retrieval. Essentially, the service matchmaking problem can be viewed as a binary data prediction problem: judging whether a service is relevant to a query or not. In addition, logistic regression provides a general way to analyze, from the estimates of the coefficients, the contribution of each matching strategy to service matchmaking in a specific domain, which effectively helps domain experts find suitable matching strategies for their applications. Based on this insight, in this paper we propose a method that uses the logistic regression model to integrate various matching strategies and to predict the probability of relevance between a query and a service based on their individual matching scores. As in our previous work, we adopt several matching strategies to compute the individual similarity values and then integrate them into an overall similarity using the trained logistic regression model. Experimental results show that the logistic regression model outperforms all basic matching strategies in terms of recall and precision, and also outperforms the well-known integrated matchmakers.

In general, the services relevant to a query are far fewer than the advertised services, so irrelevant query-service pairs far outnumber relevant ones. With traditional machine learning methods, such an imbalanced training set leads to a bias toward the larger class. To overcome this problem, a cost-sensitive model is built by assigning a penalty term to each sample. Our goal is to use the learned model to predict the probability that a service is relevant to a query and to rank the services by these probabilities. Typically, users want to find the services they need at the top of the ranking and do not care whether all relevant services are returned. From this perspective, the cost-sensitive setting in this work regards predicting a positive example as negative as having more serious consequences than predicting a negative example as positive. Therefore, the cost of classifying a query and service as irrelevant is set to 40 times that of …
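
The training and evaluation procedure described in the paper can be sketched in Python as follows. Each query-service pair is one similarity vector, a logistic regression model is fitted on all queries but one, the services of the held-out query are ranked by predicted probability of relevance, and average precision is macro-averaged over the held-out queries. This is a minimal illustration, not the authors' implementation. The toy data generator, the choice of three similarity features, the gradient-descent settings and all function names are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, lr=1.0, epochs=300):
    """Fit weights (bias first) by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict_proba(w, x):
    # Probability of relevance for one query-service similarity vector.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def average_precision(scores, labels):
    """Mean of precision@k over the ranks k at which relevant items appear."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for k, (_, rel) in enumerate(ranked, start=1):
        if rel:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


def leave_one_query_out_map(samples):
    """samples: dict query_id -> list of (similarity_vector, relevance_label).

    One fold per query: train on the other queries, rank the held-out
    query's services, and macro-average the per-query average precision.
    """
    aps = []
    for q in samples:
        train = [row for other, rows in samples.items() if other != q for row in rows]
        w = train_logreg([x for x, _ in train], [y for _, y in train])
        scores = [predict_proba(w, x) for x, _ in samples[q]]
        aps.append(average_precision(scores, [y for _, y in samples[q]]))
    return sum(aps) / len(aps)


def make_toy_data(n_queries=5, n_services=30, seed=0):
    # Hypothetical data: 3 similarity scores in [0, 1] per pair, with the
    # few relevant pairs scoring higher on average than irrelevant ones.
    rng = random.Random(seed)
    data = {}
    for q in range(n_queries):
        rows = []
        for s in range(n_services):
            rel = 1 if s < 3 else 0
            base = 0.7 if rel else 0.3
            x = [min(1.0, max(0.0, rng.gauss(base, 0.15))) for _ in range(3)]
            rows.append((x, rel))
        data[q] = rows
    return data


if __name__ == "__main__":
    print("MAP:", round(leave_one_query_out_map(make_toy_data()), 3))
```

The fitted weights `w[1:]` play the role of the coefficient estimates that the paper suggests inspecting to judge how much each matching strategy contributes. The sketch leaves out the cost-sensitive weighting of samples because the source text breaks off before fully specifying it.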
