Ma Guoqing - Design and Implementation of Web Page Prefetching and Caching Methods for Mobile Browsers

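PocketWeb: Instant Web Browsing for Mobile Devices

Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, Akshay [...], Alexandros [...]

High network latency and limited battery life make web browsing on [...] unsatisfying. In prior work [...] We start by analyzing the web browsing traces of 8,000 users [...] a machine learning method based on boosting techniques to efficiently train [a model] reflecting each user [...] reduces energy consumption by more than 50%.

Categories and Subject Descriptors: H.4.m [Information Systems]: Information Systems Applications - Miscellaneous

General Terms [...]

1. [...]

With recent advances in touch screens and the spread of mobile data networks, smartphones are rapidly gaining popularity. They are the most convenient devices for accessing the web, and according to [22] they are expected to surpass desktop devices within 4 years [...] We validated this idea with a search service: frequently issued queries and their results [are loaded] at night while [the phone] is charging [...] To ensure a fast browsing experience under existing network conditions, this paper proposes an intelligent web page [...] At a [high] level, [a prefetch is wasted] either because it happened too long ago (the content is stale) or because the prefetched content is not what the user [...]

Prior work on web content prefetching focused mainly on simply associating [...] into sets, and then, when [...] an in-depth study of the characteristics of [...] browsing rather than only the sequential features of browsing records. We use the browsing [...] obtained from the analysis [...] which pages [the user] will [visit] and when the user will [visit] them. By analyzing each user's browsing records, [...] can [fetch] the corresponding content before the user [requests] a page, thereby improving the user [...] evaluated 48,000 models; the results show that we can [...] reduces energy consumption by more than 50%.

[...] Provides a detailed analysis of the web browsing of 8,000 users, showing that each user's browsing behavior [...] When validating the accuracy of the approach, part of each user's browsing record is used to train the user's [...]

2. Web Browsing [...]

The [...] of our work is to understand user browsing behavior and build an effective model. We first [...]

The data set we used consists of the browsing records of 8,000 users over 3 months. The users were randomly selected from a large number of users who installed the Bing application or had Bing pre-installed on [...]. The users' [...] ([20,40), [40,140), [140,460), [460,∞)) were further divided into four classes (low, medium, high, extreme). The browsing records we analyzed include each user's identifier, the URL [accessed], and the time of [access].

Repeatability of Mobile Web [...]

We first study the repeatability of mobile web [accesses]. We compute [how often] a user will [visit] a [new] full URL in the next [access] [...] As Figure 1(a) shows, users are plotted separately by device type and usage volume. The results show that roughly 40% to 60% of smartphone users, whether their usage volume is high or low, have only about a 20% probability [of visiting a new URL]. [...] We also found that the more often users [browse], the more likely they are to [revisit] pages. In addition, the characteristics on feature phones [...] browsing paths are more conservative; these users tend to [visit] only the pages they need.

Figure 1. Web page [...] statistics.

[...] URL accounts for about 50% of a user's total [visits]. In other words, a single URL [...] of each user's URL [...] 10% of the [access] volume, so it is very important for a prefetching technique to consider the characteristics of each user.

[...] a small number of URLs that the user [visits] frequently; the second is a large number of URLs that the user [visits] infrequently. In order to [...] for example 3, could cause 50% of the users who browse heavily to have [...] targeted URLs [...] users who browse [...] to have 0 targeted URLs.

Figure 2. [...] page statistics.

[...] targeted records account for 70% [...] the average number of [...] URLs [...] the distribution of the time intervals for targeted pages, untargeted pages, and any pages. [...] across the 4 different [...] 12 [...] 6 minutes. Therefore, a [visit to a] targeted page can serve as [a predictor of when] the next targeted page will be [visited].

Figure 3. [...]

From [one visit to] an untargeted page to the next untargeted [visit], 70% to 80% of [...] have an interval of less than 12 minutes, far above the 35% to 50% for targeted pages. In other words, when users browse the web, they tend to [visit] multiple [untargeted] pages within a short time, compared with targeted pages. Therefore [...]

Besides the relative timing of browsing, we also studied the absolute time in the browsing records ([...]). As Figure 4 shows, we randomly selected the 3-month browsing records of 4 smartphone users and [...] each user's [accesses] within a day [...]

Figure 4. Time distribution [...]

As the figure shows, the differences among the web [access] patterns of the 4 users are evident. The records of users 1 and 4 consist mainly of untargeted [URLs], so content prefetching for this kind of user is very difficult. Interestingly, however, the [accesses] of users [...] and 4 show strong temporal periodicity (user 4's activity occurs only between 6 am and 9 am [...]). [...] the periodicity of a user's targeted pages to predict when a targeted page will be [visited], then it can [...]

[...] the real browsing records of 8,000 users, we found [aspects that are] critical to prefetching:

• A small number of targeted URLs plays the dominant role in a user's browsing behavior; predicting these targeted URLs [...]
• Targeted URLs are always [visited] in clusters. Users tend to [visit] them in batches within a very short time window, so the targeted URLs currently being [visited] are a strong guide to later user behavior.
• [...] URL records can help us predict the type of the URLs [visited] next.

Learning-Based Content [Prefetching]

For every URL presented to the user, many features are computed, [covering] information about the user, the query, the URL [...] becomes a web page prediction model, which computes, at a given time, the [access] probability of each targeted URL. As before, the higher the probability, the more likely the URL is to be [visited] next by the user. The features used to train the prediction model are the [...] of the whole model and should be extracted from the user's browsing records.

[...] URL features, which [...] the user [...] according to temporal and spatial properties [...] annotated as [accessed] or not [accessed] feature vectors. Using these [labeled] feature vectors, the system trains a model with stochastic gradient boosting [...] a feature vector is generated for [each] URL; the prediction model [evaluates] these feature vectors [against a] threshold to decide which URLs to prefetch, including the corresponding [...], CSS, JavaScript, etc.

[...] obtained from the [...] web browsing structure. Using stochastic gradient boosting, we can see more clearly [...]

To create a prediction model, MART takes the historical browsing records as input; the whole data set is [...] 1/5 [...] 1/6. The training set is used to train the model; [its entries] may be associated with a spatial parameter $s$ (the previous URL [...]), a temporal parameter $t$ (the time [...]), a spatiotemporal parameter $b$ (the time since the last [access]), a popularity parameter $p$ (the popularity of the URL among all URLs), together with an $a$ that records the user's behavior. MART uses the training data to build a classification model $M$, which estimates the probability $P_M(a \mid s, t, b, p)$. During testing we optimize using gradient descent [...]

Table 1 lists 4 [kinds of] features that reflect the properties above. Overall, for a user with $k$ targeted URLs, $2k + 11$ features are needed. Each feature is described in Table [...]. [...] features record the periodicity of the user's targeted URLs and the specific time of page [accesses]. Spatiotemporal features [combine] space and time [...] URL [...] 1 [...] 10 [...], but the pages that are decisive are usually only 2 [...] 3.

[...] test data. We first use the training data to find the user's targeted URLs; URLs [visited at least] 5 times within a month are named $t_1, t_2, \dots, t_k$ [...] for the user in [Figure] 6, $k = 2$. After the targeted URLs are identified, the training, validation, and test sets are obtained as follows.

Figure 6. [...]

[...] divided into a series of units. For example, Figure 6 shows 4 different units when $D = 5$. The feature vectors of the targeted URLs that were [accessed] are labeled as [accessed]. The other feature vectors are labeled as [...]. If a targeted URL is [accessed] several times within a unit, only the first [access] in the unit is considered. This biases the model towards the first [access], which keeps prefetching timely. Once a URL has been prefetched at the start of a unit, the page content is considered [fresh] for the whole unit. Here we assume that the refresh time of the web page is longer than $D$.

The [labeled] feature vectors in the training file are used to train the prediction model. The [labeled] feature [vectors] in the test file [...] From the model we obtain the [access] probability of each page. Pages with a probability greater than 0.5 are prefetched (this probability threshold can be changed dynamically according to network and battery conditions). Whether a prediction succeeds depends on [...]

4. [...]

[...] 6 different models (2 different feature sets [...] three different freshness thresholds); for all users, we need to create 48,000 prediction models.

5. [...]

[...] trade surplus [memory] space for network latency and energy consumption. In prior work we only used space [...] By predicting the content that users will [access], we improve the browsing experience while reducing energy consumption.

References

E. Adar, J. Teevan, and S. T. Dumais. Large scale analysis of web revisitation patterns. In Proc. of CHI, 2008.
E. Adar, J. Teevan, and S. T. Dumais. Resonance on the web: web dynamics and revisitation patterns. In Proc. of CHI, pages 1381–1390, 2009.
A. Adya, P. Bahl, and L. Qiu. Analyzing the browse patterns of mobile clients. In Proc. SIGCOMM Workshop on Internet [...]
T. Armstrong, O. Trescases, C. Amza, and E. de Lara. Efficient and transparent dynamic content updates for mobile clients. In Proc. of MobiSys, 2006.
A. Balasubramanian, B. Levine, and A. Venkataramani. Enhancing interactive web applications in hybrid networks. In Proc. of [...]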
N. Balasubramanian, A. Balasubramanian, and A. Venkataramani. Energy consumption in mobile phones: a measurement study and implications for network applications. In Proc. of IMC, pages 280–293, 2009.
P. Barford, A. Bestavros, A. D. Bradley, and M. Crovella. Changes in web client access patterns: Characteristics and caching implications. World Wide Web, 2(1–2):15–28, [...]
L. D. Catledge and J. E. Pitkow. Characterizing browsing strategies in the world-wide web. In Proc. of the 3rd World-Wide Web conference on Technology, tools and applications, pages 1065–1073, 1995.
A. Cockburn, S. Greenberg, S. Jones, B. McKenzie, and M. Moyle. Improving web page revisitation: analysis, design and evaluation. IT and Society J., 1:159–183, [...]
A. Cockburn and B. McKenzie. What do web users do? An empirical analysis of web use. Int. J. Hum.-Comput. Stud., 54:903–922, June 2001.
C. Cunha, A. Bestavros, and M. Crovella. Characteristics of www client-based [...] Technical Report TR-95-010, [...] and Systems, 1999.
H. Falaki, D. Lymberopoulos, R. Mahajan, S. Kandula, and D. Estrin. A first look at traffic on smartphones. In Proc. of IMC, pages [...]
J. H. Friedman. Stochastic gradient boosting. Comput. Stat. Data Anal., 38(4):367–378, [...]
ITRS Working Group. International technology roadmap for semiconductors 2009 report. Technical report, 2009.
Z. Jiang and L. Kleinrock. Web prefetching in a mobile environment. IEEE Personal Communications, 5(5), 1998.
F. Khalil, J. Li, and H. Wang. Integrating recommendation models for improved web page prediction accuracy. In Australasian Conference on Computer Science, 2008.
A. Komninos and M. Dunlop. A calendar based internet content pre-caching agent for small computing devices. J. of Personal and Ubiquitous Computing, 12(7), 2008.
E. Koukoumidis, D. Lymberopoulos, K. Strauss, J. Liu, and D. Burger. Pocket cloudlets. In Proc. of ASPLOS, 2011.
E. P. Markatos and C. E. Chronaki. A top-10 approach to prefetching on the web. In Proc. of INET, 1998.
B. McKenzie and A. Cockburn. An empirical analysis of web page revisitation. In Proc. of HICSS, volume 5, 2001.
Mongoose Metrics. Mobile Devices Surpass Desktop Web Browsing in Five to Ten Years, 2010. /press
A. Nanopoulos, D. Katsaros, and Y. Manolopoulos. A data mining algorithm for generalized web prefetching. IEEE Trans. on Knowledge and Data Engineering, 2003.
H. Obendorf, H. Weinreich, E. Herder, and M. Mayer. Web page revisitation revisited: Implications of a long-term click-stream study of browser usage. In Proc. of CHI, 2007.
V. N. Padmanabhan and J. C. Mogul. Using predictive prefetching to improve world wide web latency. SIGCOMM Comput. Commun. Rev., 26(3), 1996.
V. N. Padmanabhan and L. Qiu. The content and access dynamics of a busy web site: findings and implications. SIGCOMM Comput. Commun. Rev., 30:111–123, 2000.
J. Pitkow and P. Pirolli. Mining longest repeating subsequences to predict world wide web surfing. In Proc. of USENIX, pages 139–150, [...]
F. Qian, Z. Wang, A. Gerber, Z. M. Mao, S. Sen, and O. Spatscheck. Characterizing radio resource allocation for 3G networks. In Proc. of IMC, pages 137–150, 2010.
L. Tauscher and S. Greenberg. How people revisit web pages: empirical findings and implications for the design of history systems. Int. J. Hum.-Comput. Stud., 47:97–137, [...]
A. Thawani, S. Gopalan, V. Sridhar, and K. Ramamritham. Context aware timely information delivery in mobile environments. The Computer Journal, 50(4), 2007.
[...] of web content to a mobile device. In Proc. of Mobility, 2007.
Q. Wu, C. J. C. Burges, K. M. Svore, and J. Gao. Ranking, boosting, and model adaptation. Technical Report MSR-TR-2008-109, Microsoft Research, 2008.
L. Yin and G. Cao. Adaptive power-aware prefetch in wireless networks. IEEE Trans. on Wireless Communications, 3(5), 2004.

PocketWeb: Instant Web Browsing for Mobile Devices

Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, Akshay Mittal‡, Alexandros [...]
‡Indian Institute of Technology, Kanpur
The high network latencies and limited battery life of mobile phones can make mobile web browsing a frustrating experience. In prior work, we proposed trading memory capacity for lower web access latency and a more convenient data transfer schedule from an energy [...] by prefetching slowly-changing data (search queries and results) nightly, when the phone is charging. However, most web content is intrinsically much more dynamic and may be updated multiple times a day, thus eliminating the effectiveness of periodic updates. This paper addresses the challenge of prefetching dynamic web content in a timely fashion, giving the user an instant web browsing experience but without aggravating the battery lifetime issue. We start by analyzing the web access traces of 8,000 users, and observe that mobile web browsing exhibits a strong spatiotemporal signature, which is different for every user. We propose to use a ma[...] techniques to efficiently model this signature on a per user basis. The machine learning model is capable of accurately predicting future web accesses and prefetching the content in a timely manner. Our data sets show that we can accurately prefetch 60% of the URLs for about 80-90% of the users within 2 minutes before the request. The system prototype we built not only provides more than 80% lower web access time for more than 80% of the users, but it also achieves the same or lower radio energy dissipation by more than 50% for the majority of mobile users.

Categories and Subject Descriptors: H.4.m [Information Systems]: Information Systems Applications - Miscellaneous

General Terms: Algorithms, Human Factors, [...]

With recent advances in large touch screens and widespread data networks, smartphones are rapidly gaining popularity. They are the most convenient device to access the web, and according to a recent study [22], mobile devices are expected to surpass desktop web browsing in the next 4 years. The mobile phone user's experience has come a long way in the past decade, but such devices still face high network latencies and limited battery life, which can make the mobile experience frustrating.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ASPLOS'12, March 3–7, 2012, London, England, UK. Copyright © 2012 ACM 978-1-4503-0759-8/12/03...$10.00

Luckily, memory capacity is still experiencing healthy improvements [15], and can be used to mitigate the two previous limitations. Surplus memory can be used to store data that is brought to the mobile device when network conditions are favorable. The [...]

Based on this observation we have proposed the concept of Pocket Cloudlets [19], i.e., bringing part of cloud services into mobile phones to reduce latency and energy consumption, with the added benefit of significantly reducing the load on the server side as well. We demonstrated the concept using a search service: a set of popular search queries and results is loaded onto the phone at night, when the phone is charging, to speed up searches during the next day. However, in that work we limited ourselves to search and did not address actual web content. While search results change slowly (they can be considered static on a daily basis), web content can change quickly during a single day.

To enable a faster mobile web browsing experience on the [...] intelligent web content prefetcher that downloads web content on the mobile device at appropriate times, anticipating a user's future web accesses. Perhaps the biggest challenge with prefetching web content compared to search queries and results is that web content is dynamic. While the map of search queries to search results remains relatively stable over days or even weeks, web content changes frequently. News and social network websites change continuously, such that the nightly update approach does not work as well as it does for search.

[...] prefetch as much content as we can, as often as possible. However, this approach is not practical due to constraints in battery capacity and [...] web pages of a user every 2 minutes might ensure a lightning-fast web browsing experience, but the battery would not last very long. At a high level, we can consider a prefetch of a web page as unsuccessful or “wasted” either because it happened too long ago (and thus the content on the device is stale), or because the user did not end up explicitly requesting the web page at all. In either of these cases, we end up using some of the phone's resources without realizing any gains. Therefore, our goal is to achieve timely prefetches without increasing (and possibly decreasing!) energy [...]

We first analyze the web access traces from roughly 8,000 mobile users over a period of 3 months and show three well-defined patterns. First, users often visit a small set of web pages from their phones, which they tend to repeatedly visit over time. Second, user accesses are often periodic and happen at given time windows. For example, a user may check the updates every 30 [...] an afternoon break. Third, users often access content in bursts. For example, when the user checks the news, she may also check the weather forecast and current stock prices.

[Figure 1 plots: panels for the low, medium, high, and extreme volume classes. Axes: "Probability of visiting a new [...]" vs. "Fraction of [...]", and "# of the topmost frequently visited [...]" vs. "Fraction of URL [...]". Legend entries: Featurephone Std, Smartphone Average, Smartphone Std.]
Figure 1. (a) Repeatability of mobile URL visits. (b) Average and standard deviation of the cumulative URL volume that the topmost frequently visited URLs are accountable for.

Prior work in web content prefetching has focused on simply correlating sets of websites accessed together and using this information to prefetch the set when the first page of a set is accessed. In this work, we study properties of mobile web accesses that go beyond these sequential features. We take advantage of the spatiotemporal access patterns obtained from our analysis and use machine learning techniques to create a model that can be used to predict both what web pages a user is likely to request as well as when these requests are likely to occur. By learning how each individual user accesses the web over time, the phone can proactively download web content before the user explicitly attempts to access it, thus enabling an instant mobile browsing experience.

[...] we can accurately prefetch 60% of the URLs for about 80-90% of the users within 2 minutes before the request. Furthermore, the proposed approach not only provides more than 80% lower web access time for more than 80% of the users, but it also achieves the same or lower radio energy dissipation by more than 50% for the majority of mobile users.

In summary, this paper makes the following contributions:

• [...] from 8,000 users, showing widely disparate behaviors from user to user but a strong spatiotemporal structure for individual [...]
• [...] access prediction problem in machine learning, where the features are derived from the observable structure of mobile web browsing. We use stochastic gradient boosting techniques for this purpose, which allow us to provide insight into which features are the most relevant to access prediction.
• Experimentally evaluates the accuracy of the proposed approach for each user by creating individual user models with a portion of the accesses in each trace, and testing their performance with the remaining portion.
• Quantifies, using a prototype implementation, the impact of the proposed approach on the web access time and radio power consumption with respect to the state-of-the-art.
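The per-user evaluation mentioned in the contribution list (train on one portion of a trace, test on the rest) could be set up roughly as in the sketch below. It is only an illustration under stated assumptions: the `(timestamp, URL)` tuple layout, the function name `split_trace`, and the `train_fraction` value are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (timestamp in seconds, URL). This record layout is an assumption.
Access = Tuple[float, str]


def split_trace(trace: List[Access],
                train_fraction: float = 0.8) -> Tuple[List[Access], List[Access]]:
    """Split one user's trace chronologically into (train, test).

    The earliest accesses go to the training portion so that the model is
    always tested on accesses that happened after the ones it learned from.
    `train_fraction` is a hypothetical parameter, not a value from the paper.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    ordered = sorted(trace, key=lambda access: access[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]
```

Splitting by time rather than at random keeps the test portion strictly in the future of the training portion, which matches how a prefetcher would be used in practice.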

Mobile Web Browsing

Central to our work is the ability to understand and effectively model user browsing behavior. We start our study by describing the data set that we used, and continue with the results of our analysis both on aggregate across users and individually per user.

Data

We used the mobile web access logs of 8,000 users across the United States over a 3-month period. The users were randomly selected among a larger number of users that opted to download and install the Bing application or to enable the pre-installed Bing toolbar on their mobile phones. Users' phones varied from high-end smartphones (e.g., [...], Android, Blackberry) to low-end feature phones (e.g., LG and Samsung devices with custom operating systems). From the total of 8,000 users in our data set, half are [...] understanding on user behavior, each of the two sets of users was further split into 4 different classes (low, medium, high, and extreme volume classes) based on the monthly volume of web accesses ([20,40), [40,140), [140,460), [460,∞) respectively). In the logs we analyzed, the information on each web access included a unique user identifier, the full path of the accessed URL, and the access timestamp.

Repeatability of Mobile Web [...]

We first study the repeatability of mobile web accesses. We compute the number of times that any user will be visiting a new unique URL (i.e., a full path URL that has not been visited before) in the next access. We show the results in Figure 1(a) across volume classes and device types. Approximately 40% to 60% of the smartphone users, for the low and extreme volume classes, are likely to visit a new URL 20% of the time. In other words, 80% of the URL visits are repeated visits for roughly half of the smartphone users. We also observe that users in higher volume classes are more likely to repeat visits than users in lower volume classes. Finally, although the trends are similar for feature phones, the overall repeated visits are higher when compared to smartphone users. Intuitively, feature phone users that have to interact with devices with constrained user interfaces and hardware capabilities are more likely to access the web in a more conservative way compared to smartphone users. They tend to explore the web less and focus more on accessing web content they really need to access.

To examine the repeatability of URL visits in more detail, we also compute, for each user, the cumulative URL volume for the top URLs that the user accesses. The result is shown in Figure 1(b). The numbers on the horizontal axis represent the topmost frequently visited URLs (these URLs might be different across users). The vertical axis shows the cumulative fraction of the total URL visits that the number of the most frequently visited URLs is responsible for across users. Notably, across user classes, the most frequently visited URL accounts for about 50% of the overall user's URL visits. In other words, a single URL is responsible for approximately half of a typical user's URL requests. However, we [...] specifically, there are users for whom more than 90% of their total URL volume can be attributed to a single URL, and users for whom the most frequent URL corresponds to less than 10% of their total volume. It is therefore important for any prefetching technique to take into account the individual characteristics of every user.

Targeted vs. Untargeted Web [...]

From the analysis so far we infer that the URLs that a user visits fall into two classes: there is a small number of frequently visited URLs, and a long tail of infrequently visited URLs. In order to [...] them as targeted and untargeted URLs. We define a targeted URL to be one which was visited by the user at least 5 times in a month. We chose this threshold by closely analyzing the user web access logs. We found that smaller thresholds, such as 3, could be too permissive and cause 50% of the extreme volume users to have more than 50 targeted URLs. On the other hand, higher thresholds such as 10 could be too aggressive and cause 30% of the low volume users to have 0 targeted URLs.

We analyze the volume of web accesses generated by targeted and untargeted URLs. As Figure 2 shows, although the targeted accesses are only slightly more than the untargeted accesses for the low volume users, the URL accesses are dominated by targeted accesses for the remaining classes of users. For example, for high volume users, targeted accesses account for 70% of the total smartphone URL accesses.

Figure 2. Breakdown of total URL accesses into targeted (URLs that have been accessed at least 5 times in a month) and untargeted. The white numbers in each bar plot represent the average number of unique targeted/untargeted URLs.

Figure 2 also provides more insight on the average number of unique targeted URLs across the different volume classes and device types (number indicated inside the bars in Figure 2). For low and medium volume users and for both feature phones and smartphones, [...] 2 and 3 respectively. For high and extreme volume classes, it increases to 9 and 12 for feature phones and smartphones respectively. In other words, 2 to 12 unique URLs are, on average, responsible [...] mobile device to properly model when and which of the small number of targeted URLs will be accessed by the user is of paramount importance for an effective prefetching policy.

Timing of Mobile Web [...]

Web pages are constantly updated. For a prefetching technique to be effective, it needs to predict when the user's web accesses will take place. Hence, we study the temporal access patterns of our users. Figure 3 shows the elapsed time between consecutive smartphone web accesses for targeted, untargeted and combined (targeted and untargeted) URL visits. Approximately 35% to 50% of targeted URL visits across the 4 volume classes occur within 12 minutes (0.2 hours in Figure 3) of the last targeted URL visit. Additionally, 25% to 40% (depending on the volume class) of targeted URL visits take place within 6 minutes (0.1 hours in Figure 3) of the last targeted URL visit. Hence, a targeted URL access can serve as a good predictor of the time at which the next targeted URL access will occur.

Figure 3. Time elapsed between consecutive smartphone web accesses when all, targeted, or untargeted URLs are considered. The trends are identical for feature phones (not shown).

[...] concentrated in time when compared to targeted URLs. Approximately 70% to 80% of untargeted URL visits (as opposed to 35% to 50% of targeted URL visits) take place within 12 minutes of the last untargeted URL visit. In other words, when mobile users ex[...] amount of time as compared to when visiting targeted content. We can leverage this information to improve the accuracy of prefetching and save battery resources by not prefetching targeted content when the user is about to visit untargeted URLs.

A Peek into Individual [...]

In addition to relative timing, we also study the role of absolute timing in mobile web browsing (e.g., time of day when URL accesses occur). In general, knowing when to expect URL accesses can drive content prefetching. Figure 4 shows the timestamps within a day of all URL accesses that 4 random smartphone users performed over 3 months.

Figure 4. Web accesses of 4 representative smartphone users in the high volume class. All accesses over the 3 months are projected [...] blue circles represent targeted URL visits.

The variance in mobile web access patterns across the 4 users [...] targeted URLs. Most likely, a web content prefetching technique will have difficulty in modeling these users' web browsing patterns accurately, as it has no way of predicting the untargeted accesses. Interestingly, however, users 1 and 4 access web pages from their phones at given time intervals within the day (e.g., user 4's accesses are only between 6 am and 9 am, 9 pm and 11 pm, and midnight and 2 am).

On the other hand, for users 2 and 3, web accesses are dominated by a small set of targeted URLs (2 targeted URLs for user 2 and 7 for user 3). More importantly, the single most frequently visited targeted URL is responsible for the majority of that user's [...] access this single targeted URL periodically throughout the day. If a prefetching policy can predict when the targeted URL will be accessed by the users based on their periodic accesses, it can be very effective in providing an instant mobile browsing experience.

Summary and Key [...]

The analysis of real web access logs from 8,000 users has highlighted different aspects of mobile web browsing behavior that are critical to content prefetching:

• A small number of targeted URLs is responsible for the majority of a user's URL visits. Predicting these targeted URL accesses can have a huge impact on the user's browsing experience.
• Targeted URL accesses are clustered in time. Mobile users tend [...] recent targeted URL accesses can be strong indicators of future URL visits.
• Untargeted URL accesses are significantly more clustered in time than targeted URL accesses. Recent untargeted URL accesses can help us decide about the type of future URL accesses (targeted vs. untargeted).
• [...] prefetch content in a timely manner. Prefetching based only on past sequences (or, more generally, sets) of web accesses can be inefficient as it might not provide enough information as to when to [...]
• Mobile web browsing behavior across users can vary greatly in the type and number of accessed URLs as well as the timing of URL accesses. [...] advantage of the underlying spatiotemporal patterns of each individual user's web browsing behavior is required to enable timely and accurate content prefetching.

Content Prefetching as a Learning [...]

Mobile web browsing behavior exhibits several spatial and temporal properties. To enable timely prefetching of web content, the prefetching scheme needs to carefully model and learn all these different properties for each individual user. However, optimally [...]

Our approach is inspired by the web search community, [...] multiple hundreds or thousands of features are combined to rank [...] submits a query and the search engine ranks a set of URLs to show the most relevant ones higher up in the result page. The ranking problem is often formulated as a click prediction problem, where for every related URL, the engine has to estimate the probability of a user click on that URL. The higher the probability, the higher the rank of the URL. To create the click prediction model, search [...] every URL displayed to the user, various features are computed, encoding information about the user, the query, the URL or all of [...] clicked) or a non-click (if it was not clicked). The click prediction model is then trained using millions of these labeled feature vectors.

In web content prefetching, the web search click logs are replaced by the user's web access logs. The URLs are no longer web search results, but the targeted URLs identified in the user's web access logs. The click prediction model is now turned into a web access prediction model whose role is to assign, at any given time, an access probability to each targeted URL. The higher the probability, the more likely the user is to request access to this URL. The features used to train the prediction model are the most critical [...] the user's web access logs.

In contrast to web search, in web content prefetching the user does not explicitly submit a query. Thus, to determine when the [...] event-driven approach where the mobile device makes web access predictions as a result of certain user actions. For instance, the mobile device makes a prediction every time the user unlocks the phone, activates the browser or visits a URL. Depending on the resulting probabilities, the phone decides whether to prefetch any [...] approach. Offline, the mobile device records user web accesses, including automatic page refreshes, and periodically (e.g., weekly, monthly) uses this information to build a web access prediction model for the user. This model can be built on the mobile device [...] cloud. First, a set of features is extracted for every targeted URL in a user's web logs. The role of these features is to encode the underlying structure of mobile web browsing behavior in terms of [...] the web access logs are mapped to a set of feature vectors that are annotated as accesses or non-accesses.
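As a rough illustration of the labeling and decision steps, the sketch below identifies targeted URLs (at least 5 visits in a month, as defined in the text), annotates fixed-length time windows as accesses or non-accesses, and selects URLs whose predicted probability exceeds a threshold. The window-based labeling and the 0.5 threshold follow the methodology summary earlier in this document. The data layout, function names, and window length are assumptions made for illustration, and the probabilities are taken as given: this is not the paper's boosted-tree model or its feature set.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# (timestamp in minutes, URL). The tuple layout and the unit are assumptions.
Access = Tuple[float, str]


def targeted_urls(month_log: List[Access], min_visits: int = 5) -> Set[str]:
    """URLs visited at least `min_visits` times in a one-month log."""
    counts = Counter(url for _, url in month_log)
    return {url for url, n in counts.items() if n >= min_visits}


def label_windows(log: List[Access], targeted: Set[str],
                  window_min: float) -> List[Tuple[float, str, int]]:
    """Annotate every (window, targeted URL) pair as access (1) or non-access (0).

    The log is cut into consecutive windows of `window_min` minutes, starting
    at the first recorded access. Repeated visits inside one window collapse
    into a single positive label.
    """
    if window_min <= 0:
        raise ValueError("window_min must be positive")
    if not log:
        return []
    start = min(t for t, _ in log)
    end = max(t for t, _ in log)
    # Set of (window index, URL) pairs in which a targeted URL was visited.
    accessed = {(int((t - start) // window_min), url)
                for t, url in log if url in targeted}
    n_windows = int((end - start) // window_min) + 1
    return [(start + w * window_min, url, int((w, url) in accessed))
            for w in range(n_windows) for url in sorted(targeted)]


def choose_prefetch(probabilities: Dict[str, float],
                    threshold: float = 0.5) -> List[str]:
    """Targeted URLs whose predicted access probability exceeds the threshold."""
    return sorted(url for url, p in probabilities.items() if p > threshold)
```

In an event-driven setting, `choose_prefetch` would be called with fresh model outputs each time the user unlocks the phone, opens the browser, or visits a URL. The threshold could be raised when the battery is low or the network is poor.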
