




General Survey on Massive Data Encryption

Mengmeng Wang¹, Guiliang Zhu¹, Xiaoqiang Zhang²
¹ North China University of Water Resources and Electric Power, Zhengzhou, China
² State Key Lab of Software Development Environment, Beihang University, Beijing, China

Abstract: With the rapid development of cloud computing, the Internet of Things, and social network technologies, network data are increasing dramatically. The security of massive data interaction has attracted more and more attention in recent years. This paper discusses the encryption principles, advantages, and disadvantages of several mainstream massive data encryption technologies, namely encryption based on modern cryptosystems, encryption based on parallel and distributed computing, encryption based on biological engineering, and attribute-based massive data encryption. Finally, we outline the open problems and development trends of massive data encryption technology.

Keywords: massive data encryption, modern cryptosystem, distributed computing, biological engineering, attribute-based encryption

I. INTRODUCTION

With the rapid development of cloud computing, the Internet of Things, and social network technologies, network data are increasing dramatically and tend to be intensive and large-scale. Meanwhile, the ever-increasing amount of digital content that results from more powerful machine storage and networking is intensifying the demand for information and for making sense of activities, preferences, and trends. Securing massive data is such a daunting task that it can only be carried out on a network of machines.

Recently, Amazon and Google have experienced a variety of massive data security incidents in cloud computing environments. For example, Amazon Simple Storage Service (Amazon S3) was interrupted twice, in February and July of 2009, which paralyzed websites that relied on a single network storage service.
Meanwhile, pervasive networking has become available and has led to the distribution and sharing of data and, consequently, to distributed communication, creation, consumption, and collaboration. More and more users are worried about the security of massive data interaction in the network environment. Third parties usually damage these data through interception, malicious tampering, or unauthorized copying or distribution. The main security requirements of massive data are usually as follows:
(1) Confidentiality: prohibit third parties from illegally obtaining the plaintext information;
(2) Integrity: protect massive data from being destroyed, lost, or tampered with while information is stored and transferred;
(3) Copyright protection: identify the authenticity of data with effective methods, and provide effective legal evidence;
(4) Encryption efficiency: encryption measures must be suitable for the processing of massive data [1-3].

There are usually two main technologies for massive data encryption. One is encryption technology integrated into hardware equipment, for example, common encryption cards, private networks, and dedicated encryption machines. However, this approach is usually used in environments with universal demands, where the security measures are the same across different business logic and different geographical regions.
For example, such uniform security measures include IPsec VPN and SSL VPN cryptographic equipment and full disk encryption equipment. The other is to provide cryptographic services only. This approach can perform relatively personalized data encryption and decryption functions for specific applications, such as file encryption and confidential source encryption [4]. In addition to confidentiality measures, the security of massive data should also take into account the self-recovery capability, computing speed, and other factors of a single communication or computing behavior when damage occurs.

The simple communication model of a massive data encryption system is shown in Figure 1. A massive data encryption system can be expressed as a five-tuple $(P, C, K, E, D)$:
(1) $P$ denotes the plaintext set;
(2) $C$ denotes the ciphertext set;
(3) $K$ denotes the key set;
(4) $E$ denotes the encryption function: using $E$ and a key $k$ ($k \in K$) to encrypt a plaintext $p$ ($p \in P$), we obtain the ciphertext $c$ ($c \in C$), i.e., $E: P \times K \to C$, written $E_k(p) = c$ for short;
(5) $D$ denotes the decryption function: using $D$ and the key $k$ ($k \in K$) to decrypt $c$, we recover the plaintext $p$, i.e., $D: C \times K \to P$, written $D_k(c) = p$ for short.

The applications of massive data have developed widely in both depth and breadth, and massive data encryption technologies have produced some initial results, which fall into four areas: massive data encryption based on modern cryptosystems, massive data encryption based on parallel and distributed computing, massive data encryption based on biological engineering, and attribute-based massive data encryption. This paper discusses the encryption principles, advantages, and disadvantages of these technologies. Finally, we outline the open problems and development trends of massive data encryption technology.

II. MASSIVE DATA ENCRYPTION BASED ON MODERN CRYPTOSYSTEM

A. Encryption Principle

In 1949, Shannon published a paper entitled "Communication Theory of Secrecy Systems". After that, the theory of modern cryptosystems was gradually established. According to the characteristics of their keys, modern cryptosystems can be classified into symmetric cryptosystems and asymmetric cryptosystems. In a symmetric cryptosystem, the sender and receiver share an encryption key and a decryption key; the two keys are identical or easily derived from each other. Representative symmetric cryptosystems are DES (Data Encryption Standard) and AES (Advanced Encryption Standard). In an asymmetric cryptosystem, the receiver holds a public key and a private key. The public key can be published, but the private key must be kept secret. Representative asymmetric cryptosystems are RSA (Rivest, Shamir, Adleman) and ECC (Elliptic Curve Cryptosystem). Because symmetric encryption is much faster, symmetric cryptosystems are typically used to encrypt large quantities of data, and asymmetric cryptosystems to encrypt short messages such as keys.
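The division of labor described above, where a symmetric cryptosystem encrypts the bulk data and an asymmetric cryptosystem encrypts only the short key, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`E`, `D`, `wrap_key`, `unwrap_key`, `hybrid_encrypt`, `hybrid_decrypt`) are hypothetical, a SHA-256 counter-mode keystream stands in for DES or AES, and textbook RSA over two small Mersenne primes stands in for real RSA or ECC. It is a toy and is not secure; it only shows that decrypting with the same key recovers the plaintext and that only the short key passes through the asymmetric step.

```python
import hashlib
import secrets
from typing import Tuple

# --- Symmetric part: toy stream cipher standing in for DES/AES (NOT secure) ---

def keystream(k: bytes, length: int) -> bytes:
    """Expand key k into `length` bytes by hashing k with a running counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(k + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def E(k: bytes, p: bytes) -> bytes:
    """E_k(p) = c: XOR the plaintext with the keystream derived from k."""
    return bytes(a ^ b for a, b in zip(p, keystream(k, len(p))))

def D(k: bytes, c: bytes) -> bytes:
    """D_k(c) = p: XOR with the same keystream undoes the encryption."""
    return E(k, c)

# --- Asymmetric part: textbook RSA on a tiny modulus (assumption, NOT secure) ---

PRIME_P = 2**61 - 1      # Mersenne prime
PRIME_Q = 2**89 - 1      # Mersenne prime
N = PRIME_P * PRIME_Q    # public modulus, roughly 150 bits
E_PUB = 65537            # public exponent
D_PRIV = pow(E_PUB, -1, (PRIME_P - 1) * (PRIME_Q - 1))  # private exponent (Python 3.8+)

KEY_BYTES = 16           # 128-bit session key, always smaller than N

def wrap_key(k: bytes) -> int:
    """Encrypt the short session key with the public key (E_PUB, N)."""
    return pow(int.from_bytes(k, "big"), E_PUB, N)

def unwrap_key(wrapped: int) -> bytes:
    """Recover the session key with the private exponent D_PRIV."""
    return pow(wrapped, D_PRIV, N).to_bytes(KEY_BYTES, "big")

# --- Hybrid scheme: bulk data goes through E, only the key goes through RSA ---

def hybrid_encrypt(p: bytes) -> Tuple[int, bytes]:
    k = secrets.token_bytes(KEY_BYTES)   # fresh random session key
    return wrap_key(k), E(k, p)

def hybrid_decrypt(wrapped: int, c: bytes) -> bytes:
    return D(unwrap_key(wrapped), c)

if __name__ == "__main__":
    message = b"massive data " * 1000
    wrapped, ciphertext = hybrid_encrypt(message)
    assert hybrid_decrypt(wrapped, ciphertext) == message
    print(len(ciphertext), "bytes encrypted symmetrically; key wrapped asymmetrically")
```

In this sketch the cost of the asymmetric step is independent of the message size, since it always processes one 16-byte key, while the symmetric step scales linearly with the data.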
The principle of massive data encryption based on modern cryptosystems is to convert the massive data into a binary stream and then encrypt this stream with a modern cryptosystem. First, split the plaintext into small data blocks; second, encrypt these data blocks with the selected cryptosystem; finally, combine the encrypted data blocks into the ciphertext.

(1) DES
The National Institute of Standards and Technology (NIST), at that time the National Bureau of Standards, solicited cryptosystem proposals from around the world on May 15, 1973. DES was proposed in response to this call [5]. It was published as a U.S. federal standard (FIPS 46) in 1977 and adopted as an ANSI standard in 1981. DES is a block cipher with a block length of 64 bits and a key length of 56 bits.

(2) AES
Because of its short key, DES cannot satisfy security requirements in practice [6]. Therefore, NIST solicited proposals for an advanced encryption standard from around the world on April 5, 1997. NIST announced the Rijndael algorithm as AES in October 2000, and it replaced DES as the standard on November 26, 2001. AES is a block cipher with a fixed block length of 128 bits and a key length of 128, 192, or 256 bits.

(3) RSA
In 1978, Rivest, Shamir, and Adleman proposed the RSA cryptosystem [7]. Its advantages are a simple encryption principle and ease of implementation.
However, as integer factorization algorithms and the computing capability of computers improve, the key length of RSA must be extended continuously to ensure the security of the exchanged information. Herman te Riele et al. successfully factored a 512-bit RSA modulus with the number field sieve on August 22, 1999. Therefore, experts suggest using 1024-bit RSA for 10-year security and 2048-bit RSA for 20-year security in practice.

(4) ECC
In 1985, Koblitz and Miller proposed ECC, based on the difficulty of the elliptic curve discrete logarithm problem (ECDLP) [8, 9]. Research indicates that the security of 160-bit / 210-bit ECC is equivalent to that of 1024-bit / 2048-bit RSA. The advantages of ECC include high security, low time complexity, and short key length, so ECC is a promising asymmetric cryptosystem. For massive data encryption, ECC is usually used to encrypt the key.

B. Features

In theory, massive data can be encrypted by modern cryptosystems. However, modern cryptosystems were designed to encrypt text data without regard to the characteristics of massive data, so they can hardly satisfy practical requirements. Massive data is a special kind of data, characterized by large quantity, high dimensionality, and high redundancy. Modern cryptosystems have a complicated structure and a large amount of calculation, so they are not well suited to massive data encryption. The specific reasons are as follows:

(1) Large Quantity of Data
Internet information is increasing rapidly in the era of information explosion. According to the report released by the China Internet Network Information Center (CNNIC) in January 2010, the number of Chinese web pages exceeded 33 billion by the end of 2009, a 100% increase over 2008. The total size of Chinese web pages reached 520 TB.
The scale of network information has expanded sharply: by the end of July 2010, the number of images on the global Internet counted by Google search exceeded 10 billion. Meanwhile, the number of web pages indexed by Google search every second keeps increasing. In the massive data era, TB-level and even PB-level data require the support of large-scale parallel computing networks and the cost of huge storage capacity, which almost overwhelms common encryption algorithms.

(2) High Dimensional Data
Massive information on the Internet lacks effective management. As a result, it is difficult to find and manage valuable information. Because modern cryptosystems are usually designed to encrypt one-dimensional data, high-dimensional massive data must be pre-processed.

(3) High Redundancy
Massive data always have the characteristic of high redundancy. In practice, Internet information is filled with duplicate or false information. This problem cannot be solved until the massive information can be analyzed and mined deeply. Therefore, massive data encryption often allows a certain degree of distortion, as long as it does not go beyond what users can accept.

C. Summary

In theory, massive data can be encrypted by modern cryptosystems. However, massive data is a special type of data, and modern cryptosystems were designed for text data without regard to its characteristics. Therefore, modern cryptosystems, with their complicated structure, large amount of calculation, and low encryption efficiency, are to a certain extent not suitable for massive data encryption. They should be combined with other means to improve algorithm speed, security, and so on.

III. MASSIVE DATA ENCRYPTION BASED ON PARALLEL AND DISTRIBUTED COMPUTING

A. Encryption Principle

In the past two decades, computer performance has improved greatly year by year, especially that of the CPU, memory, and other hardware. However, the development of hardware technology is limited in theory. Hardware development improves the performance of a computer system vertically, whereas parallel technology contributes greatly to improving performance horizontally. Parallel processing is an essential means of analyzing and processing massive data. The strategy for encrypting massive data is usually the so-called divide and conquer, so in theory there is no limit to how far performance can be extended [10].

Researchers at Peking University have combined IBE (Identity-Based Encryption) with CPK (Combined Public Key) to provide security support for ultra-large-scale cloud computing users. The features of this approach are large scale, high efficiency, low bandwidth, ease of use, etc. [11-15]. With the emergence of data-intensive applications, massive data has received more and more attention from researchers. Eliminating duplicates in massive data in a shared-nothing environment is a challenge. Song Huaiming et al. published an article entitled "Duplication elimination in large scale data intensive systems". That paper proposes an effective and adaptive data placement method, which is a hybrid of hash partitioning and histograms. It also designs an asynchronous parallel query engine (APQE) for duplicate elimination, which provides an efficient algorithm for massive data encryption based on parallel and distributed computing [16]. Han Xixian et al. published an article named "TKEP: an efficient top-k query processing algorithm on massive data". In that paper, they perform early pruning in the increasing phase to prune most of the candidate tuples, which provides an important method for massive data encryption based on parallel and distributed computing [17].

B. Features

With the development of parallel computing technology, it has gradually been recognized that parallel processing in space or time can greatly enhance the efficiency of task processing. Parallel computing can be divided into task parallelism and data parallelism. The former makes the coordination and management of tasks very complex.
However, data parallelism divides a big task into several identical sub-tasks, which are much easier to handle [18-19]. Distributed computing solves a big and complex problem by giving small parts of the problem to many computers and then combining the partial solutions into a solution for the whole problem, and it has an insatiable appetite for computing power [20]. Cloud computing is a typical Internet-based form of distributed computing, which makes the free flow of super-computing capabilities through the Internet possible, and the huge hierarchical resource pools are connected togeth