function D = C4_5(train_features, train_targets, inc_node, region)
% Classify using Quinlan's C4.5 algorithm
% Inputs:
%   train_features - Train features
%   train_targets  - Train targets
%   inc_node       - Percentage of incorrectly assigned samples at a node
%   region         - Decision region vector: [-x x -y y number_of_points]
% Outputs:
%   D              - Decision surface
%
% NOTE: In this implementation it is assumed that a feature vector with fewer
% than 10 unique values (the parameter Nu) is discrete, and will be treated as
% such. Other vectors will be treated as continuous.

[Ni, M]  = size(train_features);
inc_node = inc_node*M/100;
Nu       = 10;

%For the decision region
N      = region(5);
mx     = ones(N,1) * linspace(region(1), region(2), N);
my     = linspace(region(3), region(4), N)' * ones(1,N);
flatxy = [mx(:), my(:)]';

%Preprocessing
%[f, t, UW, m]  = PCA(train_features, train_targets, Ni, region);
%train_features = UW * (train_features - m*ones(1,M));
%flatxy         = UW * (flatxy - m*ones(1,N^2));

%Find which of the input features are discrete, and discretize the
%corresponding dimension on the decision region
discrete_dim = zeros(1,Ni);
for i = 1:Ni,
    Nb = length(unique(train_features(i,:)));
    if (Nb <= Nu),
        %This is a discrete feature
        discrete_dim(i)  = Nb;
        [H, flatxy(i,:)] = high_histogram(flatxy(i,:), Nb);
    end
end

%Build the tree recursively
disp('Building tree')
tree = make_tree(train_features, train_targets, inc_node, discrete_dim, max(discrete_dim), 0);

%Make the decision region according to the tree
disp('Building decision surface using the tree')
targets = use_tree(flatxy, 1:N^2, tree, discrete_dim, unique(train_targets));

D = reshape(targets, N, N);
%END

function targets = use_tree(features, indices, tree, discrete_dim, Uc)
%Classify recursively using a tree

targets = zeros(1, size(features,2));

if (tree.dim == 0)
    %Reached the end of the tree
    targets(indices) = tree.child;
    return
end

%This is not the last level of the tree, so:
%First, find the dimension we are to work on
dim  = tree.dim;
dims = 1:size(features,1);

%And classify according to it
if (discrete_dim(dim) == 0),
    %Continuous feature
    in      = indices(find(features(dim, indices) <= tree.split_loc));
    targets = targets + use_tree(features(dims, :), in, tree.child(1), discrete_dim(dims), Uc);
    in      = indices(find(features(dim, indices) >  tree.split_loc));
    targets = targets + use_tree(features(dims, :), in, tree.child(2), discrete_dim(dims), Uc);
else
    %Discrete feature
    Uf = unique(features(dim,:));
    for i = 1:length(Uf),
        in      = indices(find(features(dim, indices) == Uf(i)));
        targets = targets + use_tree(features(dims, :), in, tree.child(i), discrete_dim(dims), Uc);
    end
end
%END use_tree

function tree = make_tree(features, targets, inc_node, discrete_dim, maxNbin, base)
%Build a tree recursively

[Ni, L]        = size(features);
Uc             = unique(targets);
tree.dim       = 0;
%tree.child(1:maxNbin) = zeros(1,maxNbin);
tree.split_loc = inf;

if isempty(features),
    return
end

%When to stop: if few samples would be misassigned, or only a single
%example or a single class remains
if ((inc_node > L) | (L == 1) | (length(Uc) == 1)),
    H            = hist(targets, length(Uc));
    [m, largest] = max(H);
    tree.child   = Uc(largest);
    return
end

%Compute the node's information content I
for i = 1:length(Uc),
    Pnode(i) = length(find(targets == Uc(i))) / L;
end
Inode = -sum(Pnode.*log(Pnode)/log(2));

%For each dimension, compute the gain ratio impurity
%This is done separately for discrete and continuous features
delta_Ib  = zeros(1, Ni);
split_loc = ones(1, Ni)*inf;

for i = 1:Ni,
    data  = features(i,:);
    Nbins = length(unique(data));
    if (discrete_dim(i)),
        %This is a discrete feature
        P = zeros(length(Uc), Nbins);
        for j = 1:length(Uc),
            for k = 1:Nbins,
                indices = find((targets == Uc(j)) & (features(i,:) == k));
                P(j,k)  = length(indices);
            end
        end
        Pk          = sum(P);
        P           = P/L;
        Pk          = Pk/sum(Pk);
        info        = sum(-P.*log(eps+P)/log(2));
        delta_Ib(i) = (Inode - sum(Pk.*info)) / (-sum(Pk.*log(eps+Pk)/log(2)));
    else
        %This is a continuous feature
        P = zeros(length(Uc), 2);

        %Sort the features
        [sorted_data, indices] = sort(data);
        sorted_targets         = targets(indices);

        %Calculate the information for each possible split
        I = zeros(1, L-1);
        for j = 1:L-1,
            for k = 1:length(Uc),
                P(k,1) = length(find(sorted_targets(1:j)     == Uc(k)));
                P(k,2) = length(find(sorted_targets(j+1:end) == Uc(k)));
            end
            Ps   = sum(P)/L;
            P    = P/L;
            info = sum(-P.*log(eps+P)/log(2));
            I(j) = Inode - sum(info.*Ps);
        end
        [delta_Ib(i), s] = max(I);
        split_loc(i)     = sorted_data(s);
    end
end

%Find the dimension maximizing delta_Ib
[m, dim] = max(delta_Ib);
dims     = 1:Ni;
tree.dim = dim;

%Split along the 'dim' dimension
Nf    = unique(features(dim,:));
Nbins = length(Nf);
if (discrete_dim(dim)),
    %Discrete feature
    for i = 1:Nbins,
        indices       = find(features(dim, :) == Nf(i));
        tree.child(i) = make_tree(features(dims, indices), targets(indices), inc_node, discrete_dim(dims), maxNbin, base);
    end
else
    %Continuous feature
    tree.split_loc = split_loc(dim);
    indices1       = find(features(dim,:) <= split_loc(dim));
    indices2       = find(features(dim,:) >  split_loc(dim));
    tree.child(1)  = make_tree(features(dims, indices1), targets(indices1), inc_node, discrete_dim(dims), maxNbin, base);
    tree.child(2)  = make_tree(features(dims, indices2), targets(indices2), inc_node, discrete_dim(dims), maxNbin, base);
end
%END make_tree
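% The impurity computations at the heart of make_tree can be sketched outside
% MATLAB as well. Below is a minimal, illustrative Python translation of the
% three core calculations: node entropy (Inode), the gain ratio used for
% discrete features, and the sorted scan over the L-1 candidate cut points
% used for continuous features. The function names (entropy, gain_ratio,
% best_continuous_split) are this sketch's own, not part of the toolbox.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a list of class counts (Inode above)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gain_ratio(parent_counts, child_counts):
    """C4.5 gain ratio: information gain divided by split information.

    parent_counts: class counts at the node before splitting.
    child_counts:  one class-count list per branch of the split.
    """
    n = sum(parent_counts)
    gain = entropy(parent_counts)
    split_info = 0.0
    for branch in child_counts:
        w = sum(branch) / n          # fraction of samples in this branch
        gain -= w * entropy(branch)  # weighted child entropy
        if w > 0:
            split_info -= w * math.log2(w)
    return gain / split_info if split_info > 0 else 0.0

def best_continuous_split(values, labels):
    """Scan the L-1 cut points of the sorted feature, as make_tree does for
    continuous features, and return (best information gain, threshold)."""
    pairs = sorted(zip(values, labels))
    sv = [v for v, _ in pairs]
    sl = [c for _, c in pairs]
    classes = sorted(set(labels))
    n = len(sl)
    parent = entropy([sl.count(c) for c in classes])
    best_gain, threshold = -1.0, None
    for j in range(1, n):  # split between sorted positions j-1 and j
        left = [sl[:j].count(c) for c in classes]
        right = [sl[j:].count(c) for c in classes]
        gain = parent - (j / n) * entropy(left) - ((n - j) / n) * entropy(right)
        if gain > best_gain:
            best_gain, threshold = gain, sv[j - 1]
    return best_gain, threshold
```

% For example, a perfectly separable feature [1 2 3 4] with labels
% [a a b b] yields a gain of 1 bit with the threshold at value 2,
% matching split_loc = sorted_data(s) in the MATLAB code.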