




文档简介
The use of a genetic algorithm for clustering the weighing station performance in transportation A case study Abbas Mahmoudabadi a,b, Reza Tavakkoli-Moghaddamc, aDepartment of Industrial Engineering, Payam-e-Noor University, Tehran, Iran b General Director of Traffi c Safety Department, Road Maintenance and Transportation Organization, Tehran, Iran cDepartment of Industrial Engineering and Center of Excellence for Intelligence - Based Experimental Mechanics, College of Engineering, University of Tehran, Tehran, Iran a r t i c l ei n f o Keywords: Genetic algorithm Clustering Weighing station Transportation Performance analysis a b s t r a c t In this paper, a genetic algorithm (GA) is developed to solve a clustering problem for evaluating and rank- ing the weighing stations according to their performances. In hierarchical steps of clustering, observa- tions with the least similarities should be merged and some of them will be lost. To improve this defect, the main concept behind the proposed algorithm is to avoid losing data in the hierarchical process of clustering, so all of the observations are randomly assigned into a predefi ned number of clusters by GA procedures. In this model, we consider the performance factors related to the weighing operation, such as the traffi c volume of trucks, detected overloading, type of portable or fi xed scales, and rate of acceding detections compared to the same duration in the previous year. The required data of 126 weighing sta- tions are collected during two 6-month periods. Different dimensions of the collected data are standard- ized to uniform dimensions. The main performance of a clustering method considered as the fi tness value in a genetic algorithm (GA) is to maximize the sum of deviation squares from the mean of within groups. It guaranties that the clusters have most similarities within groups and least similarities in among groups. Four different techniques of the mathematical clustering are compared with the result of the pro- posed GA by using the MATLAB software. The related results show that the clustering of weighing sta- tions is more likely to other methods. ? 2011 Elsevier Ltd. All rights reserved. 1. Introduction Drivers and transportation companies in Iran are not allowed to load their commercial vehicles more than the authorized weights, which have been determined in the Iranian transport regulations. The weight of commercial vehicles is randomly checked in weigh- ing stations via portable or fi xed scales. If the observed weight exceeds from the authorized weight, drivers or transportation companies should reduce the weight of their own vehicles, and pay the fi ne and the cost of road surface damage determined by the Iranian Ministry of Road and Transportation. One of the main concerns in rural road transportation is the performance of weigh- ing stations. The managers of public sectors try to achieve an appropriate model in order to evaluate and rank the weighing stations by the specifi c factors illustrating the performance of stations. Based on published transport statistics (RMTO, 2009), it is esti- mated that over 512 million tons of goods including high portion of fright transportation is moved via roads in Iran. Overloading (i.e., carrying more than the authorized capacity) is one of the most important issues involved in supervision of road transport is due to the risk for safety and irreparable damage on roads and con- structions. Therefore, to promote safety and prevent serious dam- age to infrastructures, laws and regulations upon determining the maximum permitted vehicle weight in the country have been ap- proved. Weighing control stations are located around the country to stop overloaded vehicles and cause their drivers to reduce the amount of loading and meet regulation. Evaluating their perfor- mances is more vital to encourage detection of overloading. Thus in this paper, a genetic algorithm is developed to classify the per- formance of weighing stations in Iran based on well-known statis- tical technique of clustering. A cluster analysis and meta-heuristic approaches for clustering problems are commonly visible in scientifi c documents. Genetic algorithm was developed (Cheng fax: +98 21 88013102. E-mail address: tavakoliut.ac.ir (R. Tavakkoli-Moghaddam). Expert Systems with Applications 38 (2011) 1174411750 Contents lists available at ScienceDirect Expert Systems with Applications journal homepage: /locate/eswa acteristics. Zhang, Ouyang, and Ning (2010) proposed an artifi cial bee colony approach for clustering data and comparing with the other popular heuristic algorithms, computational simulations re- veal improvement in terms of the solution quality and the process- ing time. A hybrid effi cient method has been presented (Panahi however, this method is frequently used for classifi cation and ranking. The GA is also used for land-cover classifi cation (Tseng, Chen, Hwang, however, the better solution is observations 1, 2 and 3 in the fi rst cluster defi ned by (5, 8) and observations 4 and 5 in the second cluster defi ned by (8, 10). If there is a method considering the entire data in the process of clus- tering, the losing data will have no rule on results. One of the well- known ways may be the random selection, and GA will help the random selection methodology to give a best solution in an appro- priate running time. In this paper, a GA-based model for classifi cation of weighing stations performance is developed regarding to the cluster analy- sis. The sum of square errors is considered as a fi tness value of the desired genetic algorithm model. The fl eet traffi c volume, rate of detected overload, type of scale used in weighing stations, and increasing rate of discovered overload regarding to the same period in the previous year are the main parameters to the perfor- mance measure of weighing stations. After analyzing the infl uen- tial factors, all weighing stations are categorized based on the cluster analysis. The model validation is carried out by comparing the sum of square errors in a GA model and the other types of mathematical clustering methods using the same data. The rest of this paper is as follows. Section 2 includes a brief discussion of the GA and clustering as well as the term of data standardizing. Sections 3 introduces the infl uential factors in performances of weighing stations. Section 4 proposes the GA-based model. Finally, conclusion and suggestions for future studies are considered in Section 5. 2. Literature review This section briefl y studies the principles of scientifi c explana- tions including genetic algorithms, clustering method, and stan- dardization of data. 2.1. Genetic algorithm A genetic algorithm (GA) is one of the widely used methods in optimization problems (Gen however, the square Euclidean distance relationship for the close similarity criteria is computed by: P2 ij X p k1 xik? xjk21 where, xikand xjkare the amount of two observations i and j in the kth factor and p is the number of factors (Sharma, 1996). One of the common methods in this fi eld is a hierarchical clus- tering method, in which the nearest approach to successive obser- vations is merged with more similarity formed. Several methods for merging observations, including the average neighbor, farthest neighbor, nearest neighbor, average distance method, and method of the total deviation of variables and mean square deviation are used (Sharma, 1996). 2.3. Determining the number of clusters In general, the number of clusters is determined based on the two methods (Sharma, 1996). In the fi rst method, the number of desired clusters is determined by experts or decision makers. This method is often used when the number of clusters is determined in the other decision-making process, such as in a fuzzy model. An- other way to determine the number of clusters is to fi nd the min- imum deviation from the mean square within clusters, called within the sum of square errors. In this method, the total deviation is calculated based on the total number of observations. The ratio of the total deviation remained as within sum of square errors is 11746A. Mahmoudabadi, R. Tavakkoli-Moghaddam/Expert Systems with Applications 38 (2011) 1174411750 considered as criteria and the number of clusters is determined based on the portion of remained errors (Sharma, 1996). 2.4. Data standardization In some cases, desired parameters in decision models do not have the same dimensions. Therefore, data should be converted to a homogenous dimension by the standardization process (Pach- eco however, in running GA they are different. 4.4. Fitness value function Fitness values of chromosomes are used to compare solutions (Gen & Cheng, 1997). The total sum of square within groups is con- sidered as a fi tness value. The best solution has the maximum amount of the total sum of square within groups. The fi tness value is calculated by extracting the total sum of square error of within groups from the maximum square errors of observations. When the number of observations is 126 and number of variables is 6, the total sum of square errors of standardized data is calculated by Sharma (1996): Maximum amount of errors 126 ? 1 ? 6 750 To calculate the sum of square within group, the deviations be- tween observation and mean of variables are considered. The cal- culation process is done for all chromosomes in each population. 4.5. Crossover operator The crossover operator includes the following stages: ? Determining two random chromosomes (i.e., feasible solutions), in which one of them is chosen from the best solutions. ? Determining the number of chromosome cells by random numbers. ? Determine the alleles that must be substituted. ? Substituting the alleles if feasibility of offspring is met. ? Adding feasible offspring to the population. This process continues based on the crossover rating. To illus- trate the crossover operation, assume two chromosomes as shown in Fig. 7. The fi rst chromosome is chosen among the best solutions that are set in the 20% of the top chromosome of the population. It means that if the population size is 40, the fi rst chromosome is se- lected among 18 when the population is sorted descending based on the fi tness values. This process ensures that at least one of par- ents is chosen among the best solutions. In the next stage, substitution of alleles is done based on the number of cells identifi ed by random numbers. If the feasibility check is false, offspring will not be generated. To illustrate the sub- stitution process, consider Fig. 8 and assume that two alleles in the fi rst chromosome should be replaced by two alleles in the second chromosome. Alleles in cells 1 and 5 should be interchanged with alleles in cells 88 and 122, respectively. Fig. 9 shows offspring after the crossover operation. 4.6. Mutation operator The mutation operator consists of the following stages: ? Determining the number of chromosomes based on the muta- tion rate. ? Determining the number of alleles should be changed. ? Changing alleles with random numbers between 1 and the number of clusters if new solution remains feasible. 1 2 3 4 5 85 86 87 87 122 123 124 125 126 5 6 7 8 9 1 2 3 4 7 8 9 10 11 12 13 14 15 16 8 9 6 5 14 15 16 17 18 9 10 11 12 13 5 6 6 5 11 12 13 14 15 20 1 2 3 4 16 17 18 19 2 3 4 5 6 18 19 20 1 2 14 15 16 17 20 1 2 3 4 Fig. 6. General view of a GA initial population. 1 2 3 4 5 85 86 87 88 122 123 124 125 126 5 12 8 12 14 1 19 2 6 7 9 14 3 12 1 2 3 4 5 85 86 87 88 122 123 124 125 126 3 9 11 7 18 10 12 17 3 1 16 15 2 4 Fig. 7. Selected chromosomes for the crossover operator. 1 2 3 4 5 85 86 87 88 122 123 124 125 126 5 12 8 12 14 1 19 2 6 7 9 14 3 12 1 2 3 4 5 85 86 87 88 122 123 124 125 126 3 9 11 7 18 10 12 17 3 1 16 15 2 4 Fig. 8. Selected alleles for substitution in the crossover operator. 1 2 3 4 5 85 86 87 88 122 123 124 125 126 3 12 8 12 1 1 19 2 6 7 9 14 3 12 1 2 3 4 5 85 86 87 88 122 123 124 125 126 3 9 11 7 18 10 12 17 5 14 16 15 2 4 Fig. 9. Offspring after the crossover operator. 11748A. Mahmoudabadi, R. Tavakkoli-Moghaddam/Expert Systems with Applications 38 (2011) 1174411750 ? Adding offspring to the population. Fig. 10 illustrates the process of the mutation operator when the allele of cell 65 is changed from 14 to 7. It means that the observa- tion 65, which belongs to the 14th cluster, moves into the 7th cluster. 4.7. Selection process The selection process tries to select the best chromosomes from the current (or old) generation to the next generation. The fi tness values of chromosomes are used as criteria in the selection process. Chromosomes after applying crossover and mutation operators and calculating fi tness values are sorted in a descending order of their fi tness values, and some of the high values equal to the pop- ulation size will be transfer to the next generation. 4.8. Running the proposed GA The well-known MATLAB software is used to program the GA by creating an m fi le. This algorithm is run for different clusters and iterations. Table 2 shows the related results for cluster numbers of 15 and 30, population sizes of 20, 30, 40, and 50, and iterations of 100, 250, and 500. The within and between sum of square errors are the average amount of fi ve times of running the process, as illustrated in Table 2. In this table, there is no signifi cant difference between 250 and 500 iterations, so 250 iterations is good enough to run the proposed GA. When the population size and the number of clusters increase, better solutions will be outlined. If the popula- tion size becomes large, better solutions will be outlined although the number of iterations is small. 4.9. Validation of the proposed GA To validate the proposed GA, the related outputs are compared with the common mathematical solutions in clustering. In this pa- per, we compare this algorithm with four known methods in clus- tering, namely the nearest neighbor method, farthest neighbor method, average method, and Ward method for merging data. The within group sum of square errors are shown in Table 3. Com- paring the results shows that the GA developed in this paper is suitable for clustering the performance of weighing stations. 5. Conclusion This paper has developed a genetic algorithm (GA) for cluster- ing the performance of weighing stations in an Iranian case study. Available data for six parameters in 126 stations have been ana- lyzed, and stations have been clustered using the developed GA - based model. To solve the problem of existing different dimensions of parameters, the collected data have been standardized. The pro- posed GA has been analyzed in the different number of clusters, iterations, and population size, and the associated results have been discussed. Validating has been done based on the comparison between fi tness values of the algorithm and the same criteria in mathematical clustering known as between the sum of the square errors. Future studies can focus on more parameters to evaluate the weighing station performance. Furthermore, the proposed GA 1 2 3 4 5 65 66 67 68 122 123 124 125 126 2 17 5 19 20 14 6 3 8 7 11 14 3 15 1 2 3 4 5 65 66 67 68 122 123 124 125 126 2 17 5 19 20 7 6 3 8 7 11 14 3 15 Fig. 10. Mutation process. Table 2 Between and within sum of square errors. ClusterPopulation sizeIterationWithin SSEBetween SSEClusterPopulation sizeIterationWithin SSEBetween SSE 15201005891613020100427323 250577173250390360 500568182500405345 3010057517530100423327 250552198250390360 500555195500406344 4010057417640100428322 250556194250398352 500547203500397353 5010058616450100430320 250553197250390360 500548202500387363 SSE = sum of square errors. Table 3 Within group sum of square errors in different ways of clustering. Number of clustersNearest neighborFarthest neighborAverageWardDeveloped GA 15522556550589547 20444521508521491 25394487456499436 30356457373468387 A. Mahmoudabadi, R. Tavakkoli-Moghaddam/Expert Systems with Applications 38 (2011) 117441175011749 can be used for clustering stations in the other enforcement activ- ities, such as over speeding and overtaking. References Ahmed, S., & Kanhere, S. S. (2007). Cluster-based forwarding in delay tolerant public transport networks. In Proceedings of the fi rst IEEE international workshop on user mobility and vehicular networks (ON-MOVE 2007) in conjunction with IEEE LCN 2007, Dublin, Ireland, October 2007 (pp. 625634). Benoit, D., Greet, W., & Keen, V. (2008). Traffi c accident segmentation by means of latent class clustering. Accident Analysis and Prevention, 40(4), 12571266. Bueno, R., Traina, A. J. M., & Traina, C. Jr., (2007). Genetic algorithms for approximate similarity queries. Data and Knowledge Engineering, 62, 459482. Cheng, C. B., & Wang, K. P. (2009). Solving a vehicle routing problem with time windows by a decomposition technique and a genetic algorithm. Expert Systems with Applications, 36(4), 77587763. Dai, D., & Oyana, T. J. (2006). An improved genetic algorithm for spatial clustering. In Proceedings of the 18th IEEE international conference on tools with artifi cial intelligence (ICTAI06), Arlington, Virginia (pp. 371380). Elmzabi, A., Bellafkih, M., & Ramdani, M. (2007). An adaptive fuzzy clustering approach for the network management. International Journal of Information Technology, 3(1), 1217. Gen, M., & Cheng, R. (1997). Genetic algorithms and engineering design. John Wiley & Sons. Gordon, J., Fielding, M., Brenner, E., & Faust, K. (1985). Typology for bus transit. Transportation Research Part A, 40(4), 12571266. Mahdavi, I., Cho, N., Shirazi, B., & Sahebjamnia, N. (2008). Designing evolving user profi le in e-CRM with dynamic clustering of Web documents. Data and Knowledge Engineering, 65, 355372. Marandi, S. M., Safapour, P., Baqeripour, M. H., & Ghasemi, M. (2007). Fuzzy-neural network and cost optimization using genetic algorithm in modelling Mar
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 矿产勘查中的非常规油气资源评价考核试卷
- 胶合板在智能家居市场的应用考核试卷
- 市场营销战略与实践考试考核试卷
- 矿山机械设备更新与投资决策考核试卷
- 租赁机械的节能减排技术考核试卷
- 节能建筑能耗模拟与优化施工考核试卷
- 员工持股计划信托股权激励合同
- 工业级烧碱(NaOH)绿色供应链管理合作协议
- 互联网平台数据隐私保护与服务协议
- 物流园区节能减排规划设计与实施合同
- 新北师大版八年级下册数学教案+教学计划大全
- 量子通信平台下的宇宙观测-全面剖析
- 2025-2030中国生物质能发电行业市场现状供需分析及投资评估规划分析研究报告
- 固体废物运输合同协议
- SL631水利水电工程单元工程施工质量验收标准第1部分:土石方工程
- (正式版)HGT 22820-2024 化工安全仪表系统工程设计规范
- 突发公共卫生事件流行病学-课件
- 利巴韦林注射液生产工艺验证方案
- 高中音乐 鉴赏 第五单元《诗乐相彰》第九节 独唱曲 课件
- 恒强文字多纱嘴组设定
- 外科护理学练习题库判断题及答案
评论
0/150
提交评论