
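Huffman Algorithm and Its Applications

I. Problem Description

Given n weights as n leaf nodes, construct a binary tree. If its weighted path length is the minimum possible, the tree is called an optimal binary tree, also known as a Huffman tree. Huffman coding is a way of encoding a file based on a Huffman tree, and it is a kind of variable-length coding. This course project takes an existing text file, counts the frequency of each character in it, assigns each character a Huffman code, translates the file into a Huffman-encoded file, and then translates the encoded file back into the original. In short, the program reads the file, counts its characters, Huffman-encodes and decodes it, and stores the encoded and decoded results in files.

II. Basic Requirements

The program must provide the following functions:
1. Count the occurrences of each character in the text file (this involves reading the file and counting characters);
2. Huffman-encode the characters in the file and store the result in a character code file;
3. Encode the contents of the text file according to the character code file;
4. Decode according to the character code file and the contents of the encoded file;
5. Output the original text, the code table, the encoded text, and the decoded text.

III. Test Data

In its medical literature, the Food and Drug Administration states that hot water comfortable enough for washing hands is not hot enough to kill bacteria, but is more effective than cold water because it removes oils from the hand that can harbor bacteria.

IV. Algorithm Design

1. Building the Huffman tree:
1) From the n given weights W1, W2, ..., Wn, form a set F = {T1, T2, ..., Tn} of n binary trees, where each Ti consists of a single root node with weight Wi and empty left and right subtrees.
2) Select from F the two trees whose roots have the smallest weights, make them the left and right subtrees of a new binary tree, and set the weight of the new root to the sum of the weights of the two subtree roots.
3) Delete these two trees from F and add the new tree to F.
4) Repeat 2) and 3) until only one tree remains in F. That tree is the Huffman tree.

2. Huffman encoding:
Starting from the root of the Huffman tree, label each left branch "0" and each right branch "1" (the convention used by the source program in Section VII) until the leaves are reached. The labels along the path from the root to a leaf, read in order, form the Huffman code of that leaf.

3. Encoding the characters of a file:
Read the characters of the file one by one, look up each character in the Huffman code table, and write its code to the output file. Repeat until the end of the file.

4. Huffman decoding:
Using the same Huffman tree that was used for encoding, start at the root and read the binary digits of the encoded message one at a time. If the digit is "0", move to the left child; otherwise move to the right child. On reaching a leaf, output the character that the code stands for, then return to the root and continue until the encoded message ends.

V. Module Breakdown

void InitHT(HuffmanT T): initializes the Huffman tree.
void SelectMin(HuffmanT T, int n, int &p1, int &p2): finds the two parentless nodes with the smallest weights.
void LoadHuffmanFile(HuffmanT T): loads the file.
void CreatHT(HuffmanT T): builds the Huffman tree.
void CharSetHuffmanEncoding(HuffmanT T, HuffmanCode H): derives the Huffman code table from the Huffman tree.
void EncodingHuffmanT(HuffmanT T, HuffmanCode H): encodes the file.
void DecodingHuffmanT(HuffmanT T, HuffmanCode H): decodes an encoded file using the Huffman tree.
void PrintHuffmanT(HuffmanT T): prints the Huffman weight table.
void PrintHuffmanH(HuffmanT T, HuffmanCode H): prints the Huffman code table.
void MainMenue(): main menu; shows prompts for the available operations.
int main(): main function; uses a while loop and a switch statement to drive the interactive session.

VI. Data Structures (ADT)

1. Storage structure of the Huffman tree:
typedef struct {
    char ch;       /* character */
    int weight;    /* character weight */
    int lchild;    /* left child */
    int rchild;    /* right child */
    int parent;    /* parent */
} THNODE;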
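The tree-building steps 1) to 4) and the encoding and decoding walks of Section IV can be sketched on a small example, assuming an index-based node array similar to THNODE. The names Node, lightestRoot, buildHuffman, codeFor and decode are hypothetical and are not part of the project's source. The sketch labels left branches '0' and right branches '1'; the opposite labeling would give an equally valid code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node layout mirroring THNODE: links are array indices, -1 means "none".
struct Node {
    int weight;
    int lchild, rchild, parent;
    explicit Node(int w) : weight(w), lchild(-1), rchild(-1), parent(-1) {}
};

// Index of the lightest parentless node, ignoring index `skip`.
static int lightestRoot(const std::vector<Node>& tree, int skip) {
    int best = -1;
    for (int i = 0; i < (int)tree.size(); ++i) {
        if (tree[i].parent != -1 || i == skip) continue;
        if (best == -1 || tree[i].weight < tree[best].weight) best = i;
    }
    return best;
}

// Steps 1)-4): leaves occupy indices 0..n-1 and the root ends up at index 2n-2.
std::vector<Node> buildHuffman(const std::vector<int>& weights) {
    assert(weights.size() >= 2);
    std::vector<Node> tree;
    for (int w : weights) tree.push_back(Node(w));
    const int n = (int)weights.size();
    for (int i = n; i < 2 * n - 1; ++i) {
        int p1 = lightestRoot(tree, -1);   // smallest root
        int p2 = lightestRoot(tree, p1);   // second smallest root
        Node merged(tree[p1].weight + tree[p2].weight);
        merged.lchild = p1;
        merged.rchild = p2;
        tree.push_back(merged);            // the new node lands at index i
        tree[p1].parent = tree[p2].parent = i;
    }
    return tree;
}

// Code of a leaf, collected while walking up to the root:
// '0' for a left branch, '1' for a right branch.
std::string codeFor(const std::vector<Node>& tree, int leaf) {
    std::string code;
    for (int c = leaf, p = tree[c].parent; p != -1; c = p, p = tree[c].parent)
        code.insert(code.begin(), tree[p].lchild == c ? '0' : '1');
    return code;
}

// Decode by walking down from the root and restarting at every leaf.
// alphabet[i] is the character stored in leaf i.
std::string decode(const std::vector<Node>& tree, const std::string& alphabet,
                   const std::string& bits) {
    const int root = (int)tree.size() - 1;
    std::string out;
    int i = root;
    for (char b : bits) {
        i = (b == '0') ? tree[i].lchild : tree[i].rchild;
        if (tree[i].lchild == -1) {        // reached a leaf
            out += alphabet[i];
            i = root;
        }
    }
    return out;
}
```

For the weights {5, 2, 1, 1}, the four leaves receive codes of length 1, 2, 3 and 3, so the weighted path length is 5×1 + 2×2 + 1×3 + 1×3 = 15.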

2. Storage structure of the Huffman code table:
typedef struct {
    char ch;                 /* stored character */
    char bits[MAX_C + 1];    /* code bit string of the character */
} CodeNode;

VII. Source Program

The source code of Huffman.cpp is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_C 256    /* maximum number of distinct characters */
#define MAX_N 512    /* maximum number of Huffman tree nodes */

/* Huffman tree structure */
typedef struct {
    char ch;       /* character */
    int weight;    /* character weight */
    int lchild;    /* left child */
    int rchild;    /* right child */
    int parent;    /* parent */
} THNODE;
typedef THNODE HuffmanT[MAX_N];

/* Huffman code table structure */
typedef struct {
    char ch;                 /* stored character */
    char bits[MAX_C + 1];    /* code bit string of the character */
} CodeNode;
typedef CodeNode HuffmanCode[MAX_C];

/* Global variables */
int n;                /* number of distinct characters (leaves) in the file to encode */
char filename[20];    /* name of the file to encode */

/* Initialize the Huffman tree */
void InitHT(HuffmanT T)
{
    int i;
    for (i = 0; i < MAX_N; i++) {
        T[i].ch = 0;
        T[i].weight = 0;
        T[i].lchild = -1;
        T[i].rchild = -1;
        T[i].parent = -1;
    }
}

/* Find the two parentless nodes with the smallest weights among T[0..n] */
void SelectMin(HuffmanT T, int n, int &p1, int &p2)
{
    int i;
    int j;
    /* take the first two parentless nodes as initial candidates */
    for (i = 0; i <= n; i++) {
        if (T[i].parent == -1 && T[i].weight > 0) {
            p1 = i;
            break;
        }
    }
    for (j = i + 1; j <= n; j++) {
        if (T[j].parent == -1 && T[j].weight > 0) {
            p2 = j;
            break;
        }
    }
    /* replace each candidate with any lighter parentless node */
    for (i = 0; i <= n; i++) {
        if ((T[p1].weight > T[i].weight) && (T[i].parent == -1) && (p2 != i) && (T[i].weight > 0))
            p1 = i;
    }
    for (j = 0; j <= n; j++) {
        if ((T[p2].weight > T[j].weight) && (T[j].parent == -1) && (p1 != j) && (T[j].weight > 0))
            p2 = j;
    }
}

/* Load the file */
void LoadHuffmanFile(HuffmanT T)
{
    unsigned int i;
    int j = 0;
    unsigned char c;    /* unsigned, so that bytes >= 128 index the table correctly */
    int a[MAX_C];
    FILE *fp;
    printf("Input file name:");
    scanf("%19s", filename);
    if ((fp = fopen(filename, "rb")) == NULL) {
        printf("Can't open %s\n", filename);
        exit(0);
    }
    for (i = 0; i < MAX_C; i++)
        a[i] = 0;
    fseek(fp, 0, SEEK_SET);
    while (fread(&c, sizeof(unsigned char), 1, fp) == 1)
        a[c]++;
    fclose(fp);
    /* Store each character of the input file and its weight in the tree T */
    for (i = 0; i < MAX_C; i++) {
        if (a[i] != 0) {
            T[j].ch = (unsigned char)i;
            T[j++].weight = a[i];
        }
    }
    n = j;
}

/* Build the Huffman tree; T[2 * n - 2] is its root */
void CreatHT(HuffmanT T)
{
    int i, p1, p2;
    LoadHuffmanFile(T);    /* load the file to be encoded */
    for (i = n; i < 2 * n - 1; i++) {
        SelectMin(T, i - 1, p1, p2);
        T[p1].parent = T[p2].parent = i;
        T[i].lchild = p1;
        T[i].rchild = p2;
        T[i].weight = T[p1].weight + T[p2].weight;
    }
}

/* Derive the Huffman code table H from the Huffman tree T */
void CharSetHuffmanEncoding(HuffmanT T, HuffmanCode H)
{
    int c;                 /* position of the child in T */
    int p;                 /* position of the parent in T */
    int i;
    int start;             /* position of the code within cd */
    char cd[MAX_C + 1];    /* temporary code buffer, large enough for any n <= MAX_C */
    cd[n] = '\0';
    for (i = 0; i < n; i++) {                  /* compute the code of each leaf in turn */
        H[i].ch = T[i].ch;                     /* character of leaf T[i] */
        start = n;                             /* initial code start position */
        c = i;                                 /* walk back from leaf T[i] */
        while ((p = T[c].parent) >= 0) {       /* until T[c] is the root */
            cd[--start] = (T[p].lchild == c) ? '0' : '1';
            c = p;
        }
        strcpy(H[i].bits, &cd[start]);         /* copy the temporary code into the table */
    }
}

/* Encode the file and save the result to a file named by the user */
void EncodingHuffmanT(HuffmanT T, HuffmanCode H)
{
    unsigned char c;
    FILE *in, *fp;
    int j;
    char encodefile[20];
    if ((in = fopen(filename, "rb")) == NULL) {
        printf("Read %s fail!\n", filename);
        exit(1);
    }
    CharSetHuffmanEncoding(T, H);
    printf("Input encode file name:");
    scanf("%19s", encodefile);
    if ((fp = fopen(encodefile, "wb")) == NULL) {
        printf("Write %s fail!\n", encodefile);
        exit(1);
    }
    while (fread(&c, sizeof(unsigned char), 1, in) == 1) {
        for (j = 0; j < n; j++) {
            if (c == (unsigned char)H[j].ch) {
                /* write the code as a run of '0'/'1' characters */
                fwrite(H[j].bits, sizeof(char), strlen(H[j].bits), fp);
                break;
            }
        }
    }
    fclose(in);
    fclose(fp);
    printf("Encoding file has saved into %s!\n", encodefile);
}

/* Decode an encoded file using the Huffman tree */
void DecodingHuffmanT(HuffmanT T, HuffmanCode H)
{
    int i;    /* index of the current tree node */
    FILE *fp, *fp1;
    char ch, ch1[20], ch2[20];
    printf("Input encode file name:");
    scanf("%19s", ch1);
    printf("Input decode file name:");
    scanf("%19s", ch2);
    if ((fp = fopen(ch1, "rb")) == NULL) {
        printf("Read %s fail!\n", ch1);
        exit(1);
    }
    if ((fp1 = fopen(ch2, "wb")) == NULL) {
        printf("Write %s fail!\n", ch2);
        exit(1);
    }
    /* Decode the Huffman codes by walking the Huffman tree */
    i = 2 * n - 2;    /* start at the root */
    while (fread(&ch, sizeof(char), 1, fp) == 1) {
        if (ch == '0')    /* code 0: go to the left subtree of this node */
            i = T[i].lchild;
        if (ch == '1')    /* code 1: go to the right subtree of this node */
            i = T[i].rchild;
        if (i < n) {      /* reached a leaf: output its character, restart at the root */
            fwrite(&T[i].ch, sizeof(char), 1, fp1);
            i = 2 * n - 2;
        }
    }
    fclose(fp);
    fclose(fp1);
    printf("Decoding accomplished!\nThe result has save input %s.\n", ch2);
}

/* Print the Huffman weight table */
void PrintHuffmanT(HuffmanT T)
{
    int i;
    FILE *fp;
    if ((fp = fopen("treeprint.txt", "wb")) == NULL) {
        printf("Open treeprint.txt fail!\n");
        exit(1);
    }
    printf("\nLeaf&weight of the Huffman tree is below:\n");
    for (i = 0; i < n; i++) {
        /* line break every 8 entries; the interval is an assumption,
           the original value was lost in extraction */
        if (i % 8 == 0 && i > 0)
            printf("\n");
        if (T[i].weight > 0) {
            fprintf(fp, "%c:%d ", T[i].ch, T[i].weight);
            printf("%c: %d ", T[i].ch, T[i].weight);
        }
    }
    fclose(fp);
    printf("\nLeaf&weight of the Huffman tree saved in treeprint.txt\n\n");
}

/* Print the Huffman code table */
void PrintHuffmanH(HuffmanT T, HuffmanCode H)
{
    int i;
    FILE *fp;
    CharSetHuffmanEncoding(T, H);
    if ((fp = fopen("codeprint.txt", "wb")) == NULL) {
        printf("Open codeprint.txt fail!\n");
        exit(1);
    }
    for (i = 0; i < n; i++) {
        printf("%c: %s\n", T[i].ch, H[i].bits);
        fprintf(fp, "%c:%s ", T[i].ch, H[i].bits);
    }
    fclose(fp);
    printf("\nHuffman tree code saved in codeprint.txt!\n\n");
}

/* Main menu */
void MainMenue()
{
    printf("\n************ Main Menue ************\n");
    printf("*                                  *\n");
    printf("* 1. Load to be dealt file.        *\n");
    printf("* 2. Show Huffman code list.       *\n");
    printf("* 3. Show Huffman weight list.     *\n");
    printf("* 4. Encoding Huffman file.        *\n");
    printf("* 5. Decoding Huffman file.        *\n");
    printf("* 6. Exit.                         *\n");
    printf("*                                  *\n");
    printf("************************************\n");
}

/* Main function */
int main()
{
    int flag = 1;
    char ch[10];
    HuffmanT T;       /* the Huffman tree */
    HuffmanCode H;    /* the Huffman code table */
    InitHT(T);        /* initialize the Huffman tree */
    while (flag) {
        MainMenue();
        printf("Please input your choice(1-6):");
        if (scanf("%9s", ch) != 1)
            break;
        switch (ch[0]) {
        case '1':
            CreatHT(T);
            break;
        case '2':
            PrintHuffmanH(T, H);
            break;
        case '3':
            PrintHuffmanT(T);
            break;
        case '4':
            EncodingHuffmanT(T, H);
            break;
        case '5':
            DecodingHuffmanT(T, H);
            break;
        case '6':
            flag = 0;
            break;
        default:
            printf("Input error!\n");
            break;
        }
    }
    return 0;
}

VIII. Test Results

The test results of the program are as follows.

decodefile.txt (Notepad):
In its medical literature, the Food and Drug Administration states that hot water comfortable enough
for washing hands is not hot enough to kill bacteria, but is more effective than cold water because it
removes oils from the hand that can harbor bacteria.

Code table output:
(LF): 1011010
(CR): 1011011
(space): 110
,: 1011101
.: 10111100
A: 10111101
D: 10111110
F: 10111111
I: 11100100
a: 1001
b: 00010
c: 10101
d: 01111
e: 1111
f: 111000
g: 101100
h: 0110
i: 0100
k: 11100101
l: 10100
m: 00011
n: 0000
o: 1000
r: 0101
s: 11101
t: 001
u: 01110
v: 1011100
w: 1110011
Huffman tree code saved in codeprint.txt!

The first two entries are unprintable in the console output; they are labeled LF and CR here on the assumption that they are the line feed and carriage return of the file's two line breaks.

Building the Huffman tree and printing the code table work correctly.

Weight table output:
Leaf&weight of the Huffman tree is below:
(LF): 2 (CR): 2 (space): 40 ,: 2 .: 1 A: 1 D: 1 F: 1
I: 1 a: 21 b: 6 c: 8 d: 7 e: 21 f: 5 g: 4
h: 13 i: 15 k: 1 l: 7 m: 6 n: 12 o: 18 r: 15
s: 11 t: 26 u: 6 v: 2 w: 3
Leaf&weight of the Huffman tree saved in treeprint.txt

Printing the weight table works correctly.

Encoding session:
* Main Menue *
* 1. Load to be dealt file. *
* 2. Show Huffman code list. *
* 3. Show Huffman weight list. *
* 4. Encoding Huffman file. *
* 5. Decoding Huffman file. *
* 6. Exit. *
Please input your choice: 4
Input encode file name: encodefile.txt
Encoding file has saved into encodefile.txt!

Decoding session (the main menu is shown again):
Please input your choice: 5
Input encode file name: encodefile.txt
Input decode file name: decodefile.txt
Decoding accomplished!
The result has save input decodefile.txt.

encodefile.txt (Notepad), one continuous bit string wrapped across lines:
111001000000110010000111101110000111111011110100101011001101001101010001
000011111010110010010111001011111101110
111000101101111110101111111000100001111110100100000111111010111110010101110101100110101111010111100011010000000100111010010101100100101001000000011011101001100100111111110111000101101001001110011010000011101110011100100111110101110101011000000111110001000010100110010001010100111111011110000100001110
101100011010110111011010111000100001011101110011100111101011001000000101100110011010010000011111110111001001110111000001000001110011010000011101111000010000111010110001101100011000110111001010100101001010011000010100110101001111101010100100110111011100001001110001110010011101110000111000010111111101
111111000111000111110101001010010111001111110001011010010000110101011000101000111111011100111001001111101011100001011111010110010111011101111111001000011011011101101001011111000111000101110011111110111010000100101001110111011100001011000000111100010110111111001101001000001111110001011010010011101010110010000110011010010101000101000010111000010100110101001111101010100100110111100

The Huffman encoding is correct.

decodefile.txt (Notepad):
In its medical literature, the Food and Drug Administra
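As a sanity check on the test results, the weight table and the code table can be combined to compute the total encoded length (the weighted path length) and to confirm that no code is a prefix of another. The sketch below is illustrative only: Entry, reportedTable, totalChars, encodedBits and isPrefixFree are hypothetical names, and the table values are a reading of the program output in this section, where the two unprintable entries and a few hard-to-read ones are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One row of the combined table: character weight and its Huffman code.
struct Entry {
    int weight;
    std::string code;
};

// Weights and codes as read from the test output (assumed values, see the lead-in).
// Order: LF, CR, space, ',', '.', A, D, F, I, a, b, c, d, e, f, g, h, i,
//        k, l, m, n, o, r, s, t, u, v, w.
std::vector<Entry> reportedTable() {
    return {
        {2, "1011010"}, {2, "1011011"}, {40, "110"}, {2, "1011101"},
        {1, "10111100"}, {1, "10111101"}, {1, "10111110"}, {1, "10111111"},
        {1, "11100100"}, {21, "1001"}, {6, "00010"}, {8, "10101"},
        {7, "01111"}, {21, "1111"}, {5, "111000"}, {4, "101100"},
        {13, "0110"}, {15, "0100"}, {1, "11100101"}, {7, "10100"},
        {6, "00011"}, {12, "0000"}, {18, "1000"}, {15, "0101"},
        {11, "11101"}, {26, "001"}, {6, "01110"}, {2, "1011100"},
        {3, "1110011"}
    };
}

// Number of characters in the input file: the sum of all weights.
int totalChars(const std::vector<Entry>& table) {
    int total = 0;
    for (const Entry& e : table) total += e.weight;
    return total;
}

// Encoded length in bits: the sum of weight * code length.
long encodedBits(const std::vector<Entry>& table) {
    long bits = 0;
    for (const Entry& e : table) bits += (long)e.weight * (long)e.code.size();
    return bits;
}

// True if no code is a prefix of (or equal to) another code.
bool isPrefixFree(const std::vector<Entry>& table) {
    for (std::size_t i = 0; i < table.size(); ++i)
        for (std::size_t j = 0; j < table.size(); ++j)
            if (i != j &&
                table[j].code.compare(0, table[i].code.size(), table[i].code) == 0)
                return false;
    return true;
}
```

Under these assumptions, the 258 characters of the test file encode to 1092 bits, compared with 258 × 8 = 2064 bits for the raw bytes. Because the program writes every bit as one '0' or '1' character, the encoded file itself would occupy 1092 bytes.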
