传统电脑决策.ppt_第1页
传统电脑决策.ppt_第2页
传统电脑决策.ppt_第3页
传统电脑决策.ppt_第4页
传统电脑决策.ppt_第5页
已阅读5页,还剩146页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

TheApplicationofFuzzySetsonDataMining Prof Tzung PeiHongDepartmentofElectricalEngineeringNationalUniversityofKaohsiung Outline IntroductionReviewDataMiningFuzzySetsFuzzyDataMiningFuzzyAssociationRules I FuzzyAssociationRules II FuzzyGeneralizedAssociationRulesFuzzyWebMining Outline ReviewofGAFuzzydataminingformembershipfunctionsandrulesTwoapproachesConclusion Reasonsfordatamining Reasonsfordatamining Miningassociationrules Bread Milk IFbreadisboughtthenmilkisbought Theroleofdatamining Usefulpatterns Transactiondata Preprocessdata DataMining Knowledgeandstrategy Differentkindsofknowledge AssociationrulesGeneralizedassociationrulesSequentialpatternsQuantitativeassociationrulesClassificationrulesClusteringrulesetc Focus Miningassociationrules Bread Milk IFbreadisboughtthenmilkisbought AprioriAlgorithm ProposedbyAgrawaletal Step1 Defineminsupandminconfex minsup 50 minconf 50 Step2 FindlargeitemsetsStep3 Generateassociationrules Example Largeitemsets ScanDatabase ScanDatabase ScanDatabase Itemset Sup A 2 B 3 C 3 E 3 L 1 Example FuzzySets 傳統電腦決策不是對 1 就是錯 0 例如 25歲以上是青年 那26歲就是中年 60分以上是及格 那60分以下就是不及格何謂模糊在對 1 與錯 0 之間 再多加幾個等級幾乎對 0 8 可能對 0 6 可能錯 0 4 幾乎錯 0 2 FuzzySets Question 168公分到底算不算高 身高 Cm 中 矮 高 170 180 160 隸屬度 再多分成幾級 連續 Example Closeto0 e g A 3 0 01 A 1 0 09 A 0 25 0 62 A 0 1DefineaMembershipFunction A x Example Closeto0 VeryCloseto0 A x FuzzySet Cont Membershipfunction 0 1 e g sunny x 0 1 FuzzySet SimpleIntuitivelypleasingAgeneralizationofcrispsetVague member non member 0or1 Non membermember gradual FuzzyOperations 交集 AND 取較小的可能性EX 學生聰明 0 8 而且用功 0 6 則是模範生 0 6 聯集 OR 取較大的可能性EX 學生聰明 0 8 或者用功 0 6 則是模範生 0 8 反面 NOT 取與1的差EX 學生聰明是0 8 則學生不聰明0 2 FuzzyInferenceExample 洪老師找小老婆的條件 大眼睛而且小嘴巴 或者是身材好Question 誰是最佳女主角 大眼睛小嘴巴身材好陶晶瑩00 80 3張惠妹10 60 8李玟00 30 9李心潔0 70 10 5蔡依林0 80 50 3 Answer 對陶晶瑩 0AND0 8 OR0 3 0OR0 3 0 3對張惠妹 1AND0 6 OR0 8 0 8對李玟 0AND0 3 OR0 9 0 9對李心潔 0 7AND0 1 OR0 5 0 5對蔡依林 0 8AND0 5 OR0 3 0 5李玟為最佳選擇 謝謝 FuzzyDecision A A1 A2 A3 A4 A5 AsetofalternativesC C1 C2 C3 Asetofcriteria Example Cont Assume C1andC2orC3E Ai evaluationfunctionE A1 0 0 8 0 3 0 0 3 0 3E A2 1 0 6 0 8 0 6 0 8 0 8E A3 0 0 3 0 9 0 0 9 0 9 thebestchoiceE A4 0 7 0 1 0 5 0 1 0 5 0 5E A5 0 8 0 5 0 3 0 5 0 3 0 5 MotivationforFuzzyMining Inreal worldapplicationsTransactionswithquantitativevaluesUsingfuzzysetstoprocessit FuzzyDataMining Solvingquantitativevaluese g Johnbuys10bread 2butterand3Milk FuzzyDataMiningMuchBreadandLittleButter MiddleAmountofMilk Fuzzydatamining Quantitativedata Linguisticterm Membershipfunction 01611 Low Middle High 10 Membershipvalue Numberofitem MainIdea NumericDatabase FuzzyDataMining Knowledge Ifmilk MiddleThencookies Low Relatedresearch LeeandHyung 1997 cutConvertingthemembershiptuplestobinarytuplesPedrycz 1996 1998RunningtheFCM FuzzyC Means methodSolvingclusterproblem Relatedresearch Chanandhisco workers 1997Maddourietal 1998Rubin 1998Hanetal 1998UsingMachinelearningmethodCombiningdatamining FuzzyMiningAlgorithm InputnquantitativetransactiondatamattributesAsetofmembershipfunctionsTwothresholdsMinimumsupport Minimumconfidence OutputAsetoffuzzyassociationrules FuzzyMiningAlgorithm Step1Transformthequantitativevalueofeachtransactiondatumintoafuzzysetusingthegivenmembershipfunctions Step2Calculatethescalarcardinalityofeachattributeregioninthetransactiondata FuzzyMiningAlgorithm Step3Foreachfuzzyregion checkwhetheritisinthesetoflarge1 itemsets L1 Step4Setr 1 whererisusedtorepresentthenumberofitemskeptinthecurrentlargeitemsets FuzzyMiningAlgorithm Step5GeneratethecandidatesetCr 1fromLrinawaysimilartothatintheapriorialgorithmexceptthattworegionsbelongingtothesameattributecannotsimultaneouslyexistinanitemsetinCr 1 FuzzyMiningAlgorithm Step6Dothefollowingsubstepsforeachnewlyformedcandidate r 1 itemsetswithitemsinCr 1 Step6 1 Calculatethefuzzyvalueofeachtransactiondatains Step6 2 Calculatethescalarcardinalityofsonthetransactions Step6 3 Ifcountsislargerthanorequaltothepredefinedminimumsupportvalue putsinLr 1 FuzzyMiningAlgorithm Step7IFLr 1isnull thendothenextstep otherwise setr r 1andrepeatSTEPs5to7 Step8Constructtheassociationrulesforalllargeq itemsetswithitems usingthefollowingsubsteps Step8 1 Formallpossibleassociationrule Step8 2 Calculatetheconfidencevaluesofallassociationrules Step9Outputtheruleswithconfidencevalueslargerthanorequaltothepredefinedconfidencethreshold AnExample GeneratingassociationrulesforcoursegradesAccordingtohistoricaldataconcerningstudents coursescores Thedataset10transactions EachcaseconsistsoffivecoursescoresObject OrientedProgramming denotedOOP Database denotedDB Statistics denotedST DataStructure denotedDS ManagementInformationSystem denotedMIS Transactions Thesetofstudents coursescoresintheexampleMinsup 1 5 Minconf 0 7 MembershipFunctions STEP1 Transformthequantitativevaluesofeachtransactiondatumintofuzzysets e g Score 86 STEP2 Calculatethescalarcardinalityofeachattributeregioninthetransactionsasthecountvalue STEP3 Checking count minimumsupportvalue 1 5 ThecontentsofL1forthisexample AnExample STEP4Setr 1STEP5GeneratethecandidatesetCr 1fromLre g OOP Middle DB Middle OOP Middle DB High OOP Middle ST Middle OOP Middle ST High OOP Middle DS Middle OOP Middle DS High OOP Middle MIS Middle OOP Middle MIS High DS High MIS Middle and DS High MIS High STEP6 DothefollowingsubstepsforeachnewlyformedcandidateitemsetStep6 1Calculatethefuzzymembershipvalueofeachtransactiondatume g STEP6 Step6 2Calculatethescalarcardinality count ofeachcandidate2 itemsetinthetransactiondata STEP6 Step6 3Checkwhetherthesecountsarelargerthanorequaltothepredefinedminimumsupportvalue1 5Result OOP Middle DB High OOP Middle ST Middle OOP Middle ST High OOP Middle DS Middle OOP Middle DS High OOP High DB Middle OOP High MIS High DB Middle MIS High DB High ST High DB High DS Middle ST High DS Middle ST High MIS Middle ST High MIS High and DS Middle MIS Middle STEP7 IFLr 1isnull thendothenextstep otherwise setr r 1andrepeatSTEPs5to7e g TheitemsetsandtheirsupportvaluesinL3 STEP8 ConstructtheassociationrulesforeachlargeitemsetusingthefollowingsubstepsStep8 1Formallpossibleassociationrulese g OOP Middle DB High DS Middle IfOOP MiddleandDB HighthenDS Middle IfOOP MiddleandDS MiddlethenDB High IfDB HighandDS MiddlethenOOP Middle STEP8 Step8 2Calculatetheconfidencefactorsfortheaboveassociationrulese g RuleIfOOP MiddleandDB HighthenDS MiddleThecountsofOOP Middle DB HighOOP Middle DB High DS Middle Confidence 0 73 STEP9 Checkwhethertheconfidencefactorsoftheaboveassociationrulesarelargerthanorequaltothepredefinedconfidencethreshold Fuzzyassociationrules 12rulesIf OOP MiddleandDS Middle then DB High conf 0 94If ST Middle then OOP Middle conf 0 94If DB HighandST High then DS Middle conf 0 86If MIS Middle then ST High conf 0 83If DS Middle then MIS Middle conf 0 78If OOP High then DB Middle conf 0 74If OOP MiddleandDB High then DS Middle conf 0 73If DS Middle then ST High conf 0 72If DB Middle then OOP High conf 0 71If DB High then DS Middle conf 0 71If DB HighandDS Middle then ST High conf 0 70If OOP High then MIS High conf 0 70 AnotherVariant Useonlytheregionwiththemaximumfuzzycardinalityforeachitem Results L1 L2 Results LessnumberoflargeitemsetsNotcompleteLesscomputationtime L3 Taxonomy Onlytheterminalitemscanappearintransactiondata Miningundertaxonomy ManyApproachesHere byAgrawalandSrikant sapproachStep1 Defineminsupandminconfex minsup 50 minconf 50 Step2 GenerateexpandedtransactiondataStep3 FindlargeitemsetsStep4 GenerateassociationrulesStep5 Checkinterestofassociationrules Step2 Generateexpandedtransactionancestorsareadded Step5 RuleInterestCriteria 1 Arulewithnoancestorrulesminedout 2 ThesupportvalueofarulebeingR timelargerthantheexpectedsupportvaluesofitsancestorrules3 TheconfidencevalueofarulebeingR timelargerthantheexpectedconfidencevaluesofitsancestorrules Algorithm InputQuantitativetransactiondatasetMembershipfunctionsTaxonomyThreeThresholdsMin supportMin confidenceR interestOutputFuzzyinterestinggeneralizedassociationrules Example QuantitativetransactiondatasetTransactionIDSomepurchaseditemswithquantitiesTaxonomyOnlytheterminalitemscanappearintransactiondata T2T1CAB T3DE Taxonomy Step1 GenerateexpandedtransactionAncestorsareappendedQuantitiesareadded Step2 TransformthequantitativedataintofuzzydataUsingthegivenmembershipfunctionsTakethefirstitemintransaction4asanexample 016911 Low Middle High 10 Membershipvalue Numberofitem Resultafterstep2 Thefuzzysetstransformedfromthequantitativedata Step3 Calculatethescalarcardinalityofeachfuzzyregion Step4 GeneratelargeitemsetsL1Assumetheminimumsupportvalueis1 5 Step5 GeneratecandidatesandcalculatefuzzycardinalityByminimumoperatorse g B Low C Middle 1 4 Step6 GeneratelargeitemsetsL2Assumetheminimumsupportvalueis1 5NoL3 L2 Step7 CalculateconfidenceandgeneraterulesByconditionalprobabilityconfidencevalue min confthresholdAssumemin conf 0 75TherearethreerulesinthisexampleIfC Middle thenT1 Low 2 0 0 77 IfT1 Middle thenT3 Middle 1 6 0 8 IfT3 Middle thenT2 High 2 4 0 86 IfT3 High thenT2 High 1 8 0 82 Step8 CheckwhethertherulesareinterestingAlltheruleshasnoancestorrulesFinalResults IfC Middle thenT1 Low 2 0 0 77 IfT1 Middle thenT3 Middle 1 6 0 8 IfT3 Middle thenT2 High 2 4 0 86 IfT3 High thenT2 High 1 8 0 82 AnotherVariant Useonlytheregionwiththemaximumfuzzycardinalityforeachitem Resultafterstep3 LessnumberoflargeitemsetsNotcompleteLesscomputationtime L1 L2 Experiments InConaPentium IIIThreelevels fourbranchesNumbersofpurchaseditemsarefirstrandomlygeneratedPurchaseditemsandquantitiesarethenrandomlygenerated Experiments For10000transactionsTherelationshipsbetweennumbersofrulesandminimumsupportvaluesandconfidencevalues Support Rules Confidence Rules Experiments DifferentaveragenumbersofitemsintransactionsAveragenumberofitems Rules Experiments Averagenumberofitems time Experiments FordifferentnumbersoftransactionsNumbersoftransactions Rules Experiments Numbersoftransactions Time Experiment AcomparisonoftheproposedapproachesRulesaremorecompletebythefirstapproach Experiment AcomparisonoftheproposedapproachesMoreTimeisspentbythefirstapproach Experiment AcomparisonoffuzzyeffectsBypredictionaccuracy Fuzzypartitionisbetterthancrisppartition WebMining Logdata WebMining Knowledgeandstrategy Browsingpatterns Webmining web contentminingfocusesoninformationdiscoveryfromsourcesacrosstheWorldWideWebe g Miningpage keywordrelationsfromwebpagesweb usageminingemphasizesontheautomaticdiscoveryofuseraccesspatternsfromwebserverse g Miningpagebrowsingpatternsfromlogfiles Goal Proposingaweb usageminingalgorithmFindinglinguisticbrowsingbehaviorsfromdatalogsonwebserverse g IfbrowsingpageAalongtimeThennextbrowsingpageBashorttimeAdoptingfuzzyconceptsinanalyzingthebrowsingtime Relatedwork WebcontentminingWebusagemining Relatedwork MiningsequentialpatternFuzzysettheoryandfuzzydatamining AprioriAllAlgorithm ProposedbyAgrawalandSrikantForfindingsequentialpatternsFivephasesFormcustomersequencesFindlargeitemsetsMapitemsetsintointegersFindlargesequencesFindmaximallylargesequences AprioriAllAlgorithm ProposedbyAgrawalandSrikantForfindingsequentialpatternse g Customer110 00BCDCustomer211 00BCCustomer115 00BCDECustomer217 00EFCustomer1 BCD BCDECustomer2 BC EF BC E Example Support 60 Example Fuzzyweb miningalgorithm InputLogdataMembershipfunctionsForconvertingbrowsingdurationsintolinguistictermsThresholdMin supOutputFuzzybrowsingpatterns Logdata Ouralgorithm Step1Thefollowingfilenamesareselected asp htm html jva cgiandclosingconnectionThefollowingfourfieldsarekeptdate time cilent ipandfile name Ouralgorithm Step2Thevaluesoffieldclient iparetransformedintocontiguousintegersforconvenenceStep3ThelogdatasortedfirstbyencodedclientIDandthenbydateandtime Ouralgorithm Step4ThetimedurationsofthewebpagesbrowsedbyeachencodedclientIDarecalculatede g 2001 03 01 05 39 56 2001 03 01 05 40 2630seconds Ouralgorithm Step5Thewebpagesbrowsedbyeachclientarelistedtoformbrowsingsequence Ouralgorithm Step6ThetimedurationsarerepresentedasfuzzysetsUsingthegivenmembershipfunctionse g theseconditem C 101 inClient4 0207080101130 Short Middle Long 10 Browsingduration Resultsafterstep6 Thefuzzysetstransformedfromthequantitativedata Ouralgorithm Step7 Themaximummembershipvalueforeachregionineachsequenceisfounde g Client2 D Middle max 0 8 0 0 0 6 0 8 Ouralgorithm Step8 Thesupportvalueofeachregioniscalculatede g D Middle D Middle 0 6 0 8 1 0 0 0 2 4 Ouralgorithm Steps9 11 Large1 sequencesaregeneratede g AssumeMin sup 1 8B Short C Middle D MiddleSteps12 15 Largek sequencesaregeneratedCandidate2 itemsetsB Short C MiddleC Middle B ShortB Short D MiddleD Middle B Short Supportvalueofcompositeregions ThesupportvalueofeachcompositeregioniscalculatedForexample client4 B Short C Middle max min 1 0 0 6 min 1 0 0 4 min 1 0 0 4 0 6 Support 0 8 0 0 0 8 0 6 2 2 Ouralgorithm Large2 sequences B Short C Middle D Middle B Short D Middle C Middle Inthisexample nolarge3 sequencesexist Ouralgorithm Step16 MaximallylargesequencesareoutputForexampleABC AB BC ACOutput ABCInthisexample B Short C Middle D Middle B Short D Middle C Middle MiningMembershipFunctions FuzzyAssociationRules Transactiondata FuzzyMining GoodEnough GAsInterestingrules BadkindsofmembershipMFs QuantitativeData FuzzyAssociationRules RelatedStudy Wang HongandTseng KnowledgeIntegration ExpertSystem 4 Reducetheeffortondevelopinganexpertsystemordecisionsupportsystem 1 Knowledgeisdistributedamongsources RB1 RBi RBn GRB UserInterface Integration 3 Knowledgecanbereused 2 ItIncreasesreliabilityofknowledge basedsystems GeneticKnowledge IntegrationFramework WhyUsingGAs Integration RB1 RBi RBn Integration mustsatisfy 1 Completeness2 Correctness3 Consistency4 Conciseness Multi objectiveoptimizationproblem GAs findingoptimalornearlyoptimalsolutions Relatedstudies Wangetal 2000 TuningmembershipfunctionsforintrusiondetectionsystemsbyGAsBasedonsimilarityofassociationrules ReviewofRelatedStudies Cont Kayaetal 2003 Deriveapredefinednumberofmembershipfunctions byGAsGettingamaximumprofitwithinanintervalofuserspecifiedminimumsupportvalues HistoryofGAs GA GeneticAlgorithmHistory JohnHolland 1975 K A DeJong D E Goldberg IdeaofGA SurvivalofthefittestIterativeProcedureGeneticoperatorsReproductionCrossoverMutationNearoptimalsolution SimpleGeneticAlgorithms AnExample AFunctionFindthemax Step1 DefineasuitablerepresentationEachChromosome12bitse g t 0 000000000000t 1 111111111111t 0 680 101011100001 Step2 CreateaninitialpopulationofNN PopulationsizeAssumeN 40 Step3 DefineasuitablefitnessfunctionftoevaluatetheindividualsFitnessfunction f t e g Thefirstsixindividuals Step4 Performthecrossoverandthemutationoperationstogeneratethepossibleoffsprings Crossover Offsprings Inheritingsomecharacteristicsoftheirparentse g Parent1 000110000001Parent2 010011001101 Child1 000111001101Child2 010010000001 Mutation OffspringspossessingdifferentcharacteristicsfromtheirascendentsPreservingareasonablelevelofpopulationdiversitye g Bitchangee g Inversion 111100000100 011100000100 111101000100 111011000100 NewOffsprings Thenewoffspringsproducedbytheoperators Step5 Replacetheindividuale g Thefirstsixindividuals NEW Step6 Iftheterminationcriteriaarenotsatisfied gotoStep4 otherwise stopthegeneticalgorithmTheterminationcriteriaThemaximumnumberofgenerationsThetimelimitThepopulationconverged Experiment FirstApproach Framework Chromosome1 MFAcquisitionprocess linguisticterms Membership GeneticFuzzy FuzzyMining forLarge1 itemsets FinalMembershipFunctionSet Minimumsupport Minimumconfidence FunctionSet1 Membership FunctionSet2 Membership Membership FunctionSetq FunctionSet3 Chromosome2 Chromosome3 Chromosomeq Transaction Database Population FuzzyMining MiningMembershipFunctions MiningFuzzyAssociationRules ChromosomeRepresentation Example FitnessFunction Formally BetterSuitability Cq OverlapFactor CoverageFactorThetwobadkindsofmembershipfunctions Suitability FormallySuitability Cq Whereoverlap factor coverage factor OverlapFactor 1 1 Acceptable Badshape CoverageFactor Item breadQuantity 1 2 3 8 16 17 18 18 H 1 8 L 0 4 M 0 6 2 L 0 4 18 H 1 8 L 1 2 18 H 1 8 2 coveragerange 1 18 coveragerange 4 18 coveragerange 4 6 10 14 16 18 L M H AnExample 6TransactionsInitialmembershipfunctions 3 6 9 0 3 6 9 0 3 6 9 0 milk 5 10 15 Low Middle High Quantity 0 Membership value cookies Low Middle High Quantity Membership value beverage 4 8 12 Low Middle High Quantity 0 Membership value 5 10 15 Low Middle High Quantity 0 5 10 15 Low Middle High Quantity 0 Low Middle High Membership value Low Middle High Membership value 4 8 12 Low Middle High Quantity 0 4 8 12 L

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论