Data Collection and Cleaning
2019-02-15, Zhou Le

Contents
- What is big data
- Big data processing flow
- Main characteristics of big data
- The concept of big data collection
- Applications of big data collection

What is big data

Taobao recommendations
- Recommendations based on your shopping preferences
- Recommendations based on your recent browsing and purchasing behavior
- Continuous guesses about your characteristics based on the devices you use
- Recommendations that follow seasonal changes

Policy timeline
- 2014-03: Big data is written into the Government Work Report for the first time.
- 2015-08: The State Council issues the Action Outline for Promoting the Development of Big Data.
- 2016-03: The outline of the 13th Five-Year Plan proposes implementing a national big data strategy.
- 2017-10: The 19th National Congress proposes advancing the big data strategy and integrating it deeply with the real economy.
- 2018: The Government Work Report proposes carrying out a big data development initiative and emphasizes using the internet, big data, and similar tools to make regulation more effective.

Industry status and outlook
In 2019 the Ministry of Human Resources and Social Security planned to release 15 new occupations: 1. Big data engineering technicians 2. Cloud computing engineering technicians 3. Artificial intelligence engineering technicians 4. Internet of Things engineering technicians 5.

Definition
Big data refers to data sets that cannot be acquired, managed, and processed within a given time using traditional, commonly used software techniques and tools.

Main characteristics of big data
- Volume: the data is large in scale, and the amount keeps growing.
- Velocity: the speed at which data is created and moved.
- Variety: data comes from many sources, in many types and formats.
- Veracity: the pursuit of high-quality data.
- Value (low value density): as the volume of data grows, the meaningful information in it does not grow in proportion.

Big data processing flow
- Data collection: use several kinds of databases (relational, NoSQL) to store data from different sources.
- Data preprocessing: import the collected data from those databases into a large distributed store (currently mainly HDFS or Hive), doing some simple cleaning and preprocessing along the way.
- Statistical analysis: classify and aggregate the data already stored in the distributed store, which covers the analysis needs of common scenarios.
- Data mining: run algorithm-based analysis on the data in order to make predictions and meet more advanced analysis needs.
- Data presentation: analyze the processed results or turn them into reports.

The concept of big data collection
- 1. What is data collection. Data collection is data acquisition. Data sources fall mainly into online data and content data.
- 2. How traditional data collection differs from big data collection. Traditional data collection: a single source and a fairly small volume; a single structure; relational and parallel databases. Big data collection: a wide range of sources and a huge volume; rich data types; distributed databases.
- 3. Big data collection techniques. Big data collection applies ETL operations to data: extracting, transforming, and loading it so that its latent value can eventually be mined. ETL stands for Extract-Transform-Load.
  Extract: obtain data from the various data sources.
  Transform: convert the source data into target data in the required format.
  Load: load the target data into the data warehouse.

Big data collection systems
- 1. Log collection systems (Apache Flume, Scribe)
- 2. Network data collection systems (Scrapy framework, Apache Nutch)
- 3. Database collection systems (relational, NoSQL, and other databases)

Applications of big data collection

Skills to prepare
- Python basics
- Basic Linux operations
- Database basics (working with SQL statements)

Environment to prepare
- Python
- JDK (Java environment)
- Database (MySQL)
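The extract, transform, load (ETL) steps described under big data collection techniques can be sketched in Python, the language the deck lists as a prerequisite. This is a minimal illustration under stated assumptions, not the deck's own implementation: the sample CSV, the column names, the `orders` table, and the helper names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and the standard-library `sqlite3` module stands in for the MySQL database mentioned in the environment list.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the columns and values are invented for illustration.
RAW_CSV = """user_id,item,price,ts
1, Phone ,2999.00,2019-02-15
2,laptop,,2019-02-15
1, Phone ,2999.00,2019-02-15
3,Headset,199.5,2019-02-16
"""


def extract(text):
    """Extract: read rows from a data source (here, CSV text) as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize item names, drop rows with no price, drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        price = row["price"].strip()
        if not price:
            continue  # simple cleaning rule: skip records with a missing price
        record = (
            int(row["user_id"]),
            row["item"].strip().lower(),
            float(price),
            row["ts"].strip(),
        )
        if record in seen:
            continue  # skip rows that repeat an earlier record exactly
        seen.add(record)
        cleaned.append(record)
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(user_id INTEGER, item TEXT, price REAL, ts TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run the three steps in order and return the number of loaded rows."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
    print(run_etl(connection))  # prints 2
```

Keeping the three steps as separate functions mirrors the Extract-Transform-Load split: the source (a log file, a crawler's output, another database) or the target can be swapped without touching the cleaning rules in between.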