Deep Learning Courseware (lecture slides)

Deep Learning

The starting point of AI: the Dartmouth Conference
Proposers: Nathaniel Rochester (1919-2001), John McCarthy (1927-2011), Marvin Minsky (1927-2016), Claude Shannon (1916-2001).

Stages of AI: 1950s / 1980s / 2000s / Future
Themes of the original Dartmouth proposal: automatic computers; how to program a computer to use language; neuron nets; theory of the size of a calculation; self-improvement; abstractions; randomness and creativity. Later waves: rule-based expert systems; general intelligence.

Current AI technology: open problems
- Depends on large amounts of labeled data
- "Narrow AI": trained to perform one specific task
- Not sufficiently robust or safe
- No capacity for explanation; models are opaque

Current state of AI: applications. Why AI became a hot topic: deep learning and reinforcement learning; large-scale, complex, streaming data.

Outline
1. An analysis of the White House AI R&D Strategic Plan
2. An analysis of the AI strategies of ten technology companies
3. Deep learning and its latest advances
4. Reinforcement learning and its latest advances
5. Applications of deep learning in enterprise data analytics

The US AI R&D Strategic Plan

Strategy I: Make long-term investments in AI research
Goals: secure continued US world leadership; prioritize investment in next-generation AI technologies.
1. Advance data-focused methods for knowledge discovery
- Efficient data-cleaning techniques that ensure the veracity and appropriateness of the data used to train systems
- Jointly exploit data, metadata, and human feedback or knowledge
- Analysis and mining of heterogeneous, multimodal data: discrete, continuous, temporal, spatial, spatio-temporal, and graph data
- Mining of small data, stressing the importance of rare events
- Fusing data with knowledge, especially domain knowledge bases

2. Enhance the perceptual capabilities of AI systems
- Hardware and algorithms that improve the robustness and reliability of perception
- Better detection, classification, discrimination, and recognition of objects in complex, dynamic environments
- Better perception of humans by sensors and algorithms, so systems can cooperate with people more effectively
- Computing and propagating the uncertainty of the perception system, so downstream judgments can take it into account
3. Understand the theoretical capabilities and limits of AI
- The theoretical upper limits of AI under current hardware and algorithmic frameworks, across learning, language, perception, reasoning, creativity, and planning abilities
4. Pursue general-purpose AI
- Today's AI systems are all "narrow AI", not "general AI"
- General AI: flexible, multi-task, possessing free will, with general ability across cognitive tasks (learning, language, perception, reasoning, creativity, planning); transfer learning is one path toward it

5. Scale AI systems
- Coordination among multiple AI systems; distributed planning and control
6. Human-like AI
- Self-explanation capability for AI systems
- How today's AI learns: big data, black boxes. How humans learn: small data, formal instruction, rules, and hints of all kinds
- Human-like AI systems could serve as intelligent assistants and intelligent tutors
7. Develop practical, reliable, easy-to-use robots
- Improve robot perception for more intelligent interaction with the complex physical world

8. AI and hardware, each driving the other
- GPUs: better memory, I/O, clock speeds, parallelism, and energy efficiency
- "Neuron-like" (neuromorphic) processors; processing of streaming, dynamic data
- Using AI to improve hardware: high-performance computing, optimized energy consumption, better compute performance, intelligent self-configuration, optimized data movement between multicore processors and memory

Strategy II: Develop effective methods for human-AI collaboration
Not replacing people but working with them, emphasizing the complementary roles of humans and AI systems.
1. Human-aware AI
- Many AI systems are designed to be used by people; they replicate human computation, decision-making, and cognition
2. Develop human-augmentation technologies
- Stationary devices, wearables, implants; assistance in understanding data
3. Visualization and human-friendly AI interfaces
- Visualize data and information in ways people can understand; improve the efficiency of human-system communication
4. Develop more effective language-processing systems
- Solved: fluent speech recognition in quiet environments
- Unsolved: recognition in noisy environments, far-field speech recognition, accents, children's speech, impaired speech, language understanding, dialogue

Strategy III: Understand and address the ethical, legal, and societal implications of AI
1. Study the ethical, legal, and societal implications of AI technology, with the expectation that systems conform to human norms
- By design, AI systems should meet human ethical standards: fairness, justice, transparency, accountability
2. Build ethical AI
- How to quantify ethics, turning fuzzy notions into precise system and algorithm designs; ethics is usually vague and varies across cultures, religions, and beliefs
3. Architectures for ethical AI
- A two-tier architecture with one layer dedicated to ethical reasoning; or ethical standards embedded into every step of AI engineering

Strategy IV: Ensure the safety and security of AI systems, both in themselves and toward their surroundings
Before AI systems come into widespread use, their safety must be assured. Study the challenges of creating robust, dependable, trustworthy, understandable, and controllable AI systems, and how to meet them:
1. Improve the explainability and transparency of AI systems
2. Build trust
3. Strengthen verification and validation
4. Self-monitoring, self-diagnosis, self-repair
5. Handling of the unexpected; resistance to attacks

Strategy V: Develop shared public datasets and simulation environments for AI
An important public good, to be pursued while fully respecting the rights and interests of companies and individuals in their data; encourage open source.

Strategy VI: Standards for evaluating and benchmarking AI technologies
Develop appropriate evaluation strategies and methods.

Strategy VII: Better understand the nation's AI R&D workforce needs
Ensure a sufficient supply of talent.

Big data and AI
- Data is the raw material of AI
- Big-data technologies such as parallel and stream computing are what make AI practical
- AI is the principal method for analyzing big data, especially complex data

The AI plays of the top 10 technology companies

Google: an AI-first strategy
- Spent US$400 million to acquire DeepMind, the London AI startup (founded 2011): AlphaGo, DNC, WaveNet, Q-learning
- 1. Speech recognition and synthesis; 2. machine translation; 3. self-driving cars; 4. Google Glass; 5. Google Now; 6. acquisition of Api.ai

Facebook
- Open-sourced its deep-learning code: Torch
- Facebook M digital assistant
- Research and applications: FAIR & AML

Apple AI
- Apple Siri; Apple bought Emotient and VocalIQ

Partnership on AI (announced 29 September 2016)
It will "conduct research, recommend best practices, and publish research under an open license in areas such as ethics, fairness and inclusivity; transparency, privacy, and interoperability; collaboration between people and AI systems; and the trustworthiness, reliability and robustness of the technology."

Elon Musk: OpenAI
- The CEO behind PayPal, Tesla, SpaceX, and SolarCity invested US$1 billion to found OpenAI

Microsoft, IBM, Baidu
Domestic giants: Tencent, Alibaba, and iFlytek are investing heavily in AI.

5. Deep learning in enterprise data analytics: a case study
An example: AI in data analytics with deep learning — customer sentiment analysis.

Overview
- Introduction
- Emotion Recognition in Text
- Emotion Recognition in Speech
- Emotion Recognition in Conversations
- Industrial Application
(Covering datasets, features, and methods.)

Introduction: interchangeable terms
Opinion mining, sentiment analysis, emotion recognition, polarity detection, review mining.

Introduction: what are emotions?

Introduction: problem definition
We will focus only on document-level sentiment (opinion mining).

Introduction: text examples
- "a thriller without a lot of thrills"
- "an edgy thriller that delivers a surprising punch"
- "a flawed but engrossing thriller"
- "it's unlikely we'll see a better thriller this year"
- "an erotic thriller that's neither too erotic nor very thrilling either"
Emotions are expressed artistically, with the help of negation, conjunction words, and sentiment words.

Introduction: text examples
- DSE (direct subjective expressions): explicitly express an opinion holder's attitude
- ESE (expressive subjective elements): indirectly express the attitude of the writer
Emotions are expressed both explicitly and indirectly.

Introduction: text examples
Emotions are expressed in language that is often obscured by sarcasm, ambiguity, and plays on words, all of which can be very misleading for both humans and computers:
- "A sharp tongue does not mean you have a keen mind."
- "I don't know what makes you so dumb, but it really works."
- "Please, keep talking. So great. I always yawn when I am interested."

Introduction: speech and conversation examples
(Example recordings and dialogue excerpts were shown on the original slides.)

Typical approach: a classification task (text)
Given a document, supervised learning maps it to positive / neutral / negative:
- Features: n-grams (unigrams, bigrams), POS tags, term frequency, syntactic dependencies, negation tags
- Classifiers: SVM, MaxEnt, Naive Bayes, CRF, random forest
Unsupervised learning: POS-tag patterns + dictionaries + mutual information; hand-written rules.
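As a concrete illustration of this classical pipeline, a minimal scikit-learn sketch (library choice, toy data, and hyper-parameters are my own assumptions, not from the slides):

```python
# Classical supervised sentiment pipeline: n-gram features + linear classifier.
# Hypothetical toy data; any corpus labeled pos/neu/neg would do.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

docs = ["a thriller without a lot of thrills",
        "an edgy thriller that delivers a surprising punch",
        "a flawed but engrossing thriller"]
labels = ["neg", "pos", "pos"]

clf = Pipeline([
    # Unigram + bigram term-frequency features, as on the slide.
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("svm", LinearSVC()),  # one of the listed classifiers (SVM)
])
clf.fit(docs, labels)
print(clf.predict(["unlikely we'll see a better thriller this year"]))
```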

Typical approach: a classification task (speech)
Supervised learning with labels positive / neutral / negative:
- Prosodic features: pitch, energy, formants, etc.
- Voice-quality features: harsh, tense, breathy, etc.
- Spectral features: LPC, MFCC, LPCC, etc.
- Teager-energy-operator (TEO) based features: TEO-FM-var, TEO-Auto-Env, etc.
- Classifiers: SVM, GMM, HMM, DBN, KNN, LDA, CART
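A sketch of extracting two of these feature families with librosa (an assumed library choice; window settings, pooling, and the input file are illustrative):

```python
# Spectral (MFCC), prosodic (pitch, energy) features for one utterance.
# "speech.wav" is a hypothetical input file.
import librosa
import numpy as np

y, sr = librosa.load("speech.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)          # spectral
f0, voiced, _ = librosa.pyin(y, fmin=50, fmax=400, sr=sr)   # pitch track
energy = librosa.feature.rms(y=y)                           # frame energy

# A common utterance-level representation: pool statistics over frames.
feats = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                        [np.nanmean(f0)], energy.mean(axis=1)])
print(feats.shape)
```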

Challenges remain
- Text-based: capturing compositional effects with higher accuracy; negating positive sentences; negating negative sentences; conjunctions
- Speech-based: effective features are unknown; emotional speech segments tend to be transcribed with lower ASR accuracy

Overview
- Introduction
- Emotion Recognition in Text: word embeddings for sentiment analysis; CNNs for sentiment classification; RNNs and LSTMs for sentiment classification; prior knowledge + CNN/LSTM; parsing + RNN
- Emotion Recognition in Speech
- Emotion Recognition in Conversations
- Industrial Application
How can deep learning change the game?

Emotion classification with deep-learning approaches

1. Word embeddings as features
The representation of text is very important for the performance of many real-world applications, including emotion recognition:
- Local representations: n-grams; bag-of-words; 1-of-N coding
- Continuous representations: latent semantic analysis; latent Dirichlet allocation
- Distributed representations: word embeddings
(Tomas Mikolov, "Learning Representations of Text using Neural Networks", NIPS Deep Learning Workshop 2013; see also Bengio et al., 2006; Collobert & Weston, 2008; Mnih & Hinton, 2008; Turian et al., 2010; Mikolov et al., 2013.)

Word embeddings
Skip-gram architecture and CBOW: the hidden-layer vector is the word-embedding vector for w(t).

Word embeddings for sentiment detection
Since Mikolov's 2013 work, word embeddings have been widely accepted as standard features for NLP applications, including sentiment analysis. The word vector space implicitly encodes many linguistic regularities among words, both semantic and syntactic.
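For reference, the skip-gram training objective from Mikolov et al. (2013), where $v$ and $v'$ are the input and output vector representations, $c$ is the context-window size, and $W$ is the vocabulary size (CBOW instead predicts $w_t$ from its averaged context):

$$\frac{1}{T}\sum_{t=1}^{T}\;\sum_{\substack{-c \le j \le c \\ j \neq 0}} \log p(w_{t+j} \mid w_t), \qquad p(w_O \mid w_I) = \frac{\exp\big({v'_{w_O}}^{\top} v_{w_I}\big)}{\sum_{w=1}^{W} \exp\big({v'_{w}}^{\top} v_{w_I}\big)}$$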

Example: Google's pre-trained word vectors (trained on roughly 100 billion words). Do they encode polarity similarities?

Top relevant words to "good" (cosine similarity):
great      0.729151
bad        0.719005
terrific   0.688912
decent     0.683735
nice       0.683609
excellent  0.644293
fantastic  0.640778
better     0.612073
solid      0.580604
lousy      0.576420
wonderful  0.572612
terrible   0.560204
Good       0.558616

Mostly yes, but the space does not separate antonyms well ("bad", "lousy", and "terrible" all rank highly).
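This probe is easy to reproduce with gensim and the public GoogleNews vectors (the file name and exact scores depend on the download; a sketch):

```python
# Nearest neighbors of "good" in pre-trained word2vec space.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)
for word, score in wv.most_similar("good", topn=10):
    print(f"{word}\t{score:.6f}")  # note that 'bad' ranks near the top
```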

Learning sentiment-specific word embeddings
Tang et al., "Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification", ACL 2014.
- In spirit it is similar to multi-task learning: it trains the same way as regular word embeddings, but the loss function considers both the semantic context and the sentiment distance to Twitter emoticons
- Training data: 10 million tweets selected by positive and negative emoticons
- Evaluated on the Twitter sentiment classification track of SemEval 2013

Paragraph vectors
Le and Mikolov, "Distributed Representations of Sentences and Documents", ICML 2014.
- Paragraph vectors are distributed vector representations for pieces of text, such as sentences or paragraphs
- The paragraph vector is also asked to contribute to the prediction of the next word, given many contexts sampled from the paragraph
- Each paragraph corresponds to one column of the matrix D; it acts as a memory of what is missing from the current context, i.e., the topic of the paragraph
- Achieved the best results on the MR (movie review) dataset
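A minimal paragraph-vector sketch using gensim's Doc2Vec (the corpus and hyper-parameters are illustrative assumptions, not the paper's setup):

```python
# Train paragraph vectors (PV-DM, dm=1) on a toy corpus.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [TaggedDocument(words=doc.lower().split(), tags=[i])
          for i, doc in enumerate([
              "a flawed but engrossing thriller",
              "an edgy thriller that delivers a surprising punch"])]
model = Doc2Vec(corpus, vector_size=100, min_count=1, epochs=40, dm=1)

# Infer a vector (a column of D, in the paper's notation) for unseen text.
vec = model.infer_vector("a thriller without thrills".split())
print(vec[:5])
```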

CNN for sentiment classification
Ref: Yoon Kim, "Convolutional Neural Networks for Sentence Classification", EMNLP 2014.
- A simple CNN with one layer of convolution on top of word vectors, motivated by CNNs' success on many other NLP tasks
- Input layer: word vectors from the pre-trained Google News word2vec
- Convolution layer: window sizes of 3, 4, and 5 words, each with 100 feature maps, giving 300 features in the penultimate layer
- Pooling layer: max-over-time pooling
- Output layer: fully connected softmax producing a distribution over labels
- Regularization: dropout on the penultimate layer, with a constraint on the L2 norms of the weight vectors
- Embedding vectors are fine-tuned during training

Common datasets (a table of standard sentiment benchmarks was shown here).

CNN for sentiment classification: results
- CNN-rand: all word embeddings randomly initialized
- CNN-static: word2vec embeddings, kept fixed
- CNN-non-static: word2vec embeddings, fine-tuned during training
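A minimal PyTorch sketch of this architecture (the framework choice is mine — the original release was Theano; vocabulary size and class count are placeholders, and the L2-norm constraint is omitted):

```python
# Kim (2014)-style CNN: parallel filters of width 3/4/5 over word vectors,
# max-over-time pooling, dropout, softmax classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KimCNN(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=300, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)  # load word2vec here in practice
        # Three filter widths (3/4/5 words), 100 feature maps each.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, 100, k) for k in (3, 4, 5)])
        self.drop = nn.Dropout(0.5)           # dropout on the penultimate layer
        self.fc = nn.Linear(300, n_classes)   # 3 x 100 pooled features

    def forward(self, x):                     # x: (batch, seq_len) token ids
        e = self.emb(x).transpose(1, 2)       # (batch, emb_dim, seq_len)
        # Max-over-time pooling of each feature map.
        pooled = [F.relu(c(e)).max(dim=2).values for c in self.convs]
        return self.fc(self.drop(torch.cat(pooled, dim=1)))

logits = KimCNN()(torch.randint(0, 20000, (4, 50)))
print(logits.shape)  # torch.Size([4, 2])
```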

Why is it successful?
- Multiple filters and multiple feature maps
- Emotions are expressed in segments, rather than spanning the whole sentence
- Pre-trained word2vec vectors as input features; the embeddings improve further with non-static training, and antonyms are further separated after training

Resources for this work
- Source code: https://github.com/yoonkim/CNN_sentence
- TensorFlow implementation: https://github.com/dennybritz/cnn-text-classification-tf
- Extensive experiments: https://arxiv.org/pdf/1510.03820v4.pdf

Dynamic CNN for sentiment
Kalchbrenner et al., "A Convolutional Neural Network for Modelling Sentences", ACL 2014.
Hyper-parameters in the experiments: k = 4; m = 5 with 14 feature maps; m = 7 with 6 feature maps; d = 48.

Dynamic CNN vs. Kim's CNN:
- Convolution: one-dimensional (DCNN) vs. two-dimensional (Kim)
- Word vectors: 48-d, randomly initialized (DCNN) vs. 300-d, initialized with Google word2vec (Kim)
- Architecture: more complicated, with dynamic pooling (DCNN) vs. straightforward (Kim)
- Feature maps: 6 and 4 (DCNN) vs. 100-128 (Kim)
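The DCNN's distinctive ingredient is (dynamic) k-max pooling: keep the k largest activations of each feature map in their original order, with k itself a function of layer depth and sentence length in the full model. A minimal static version:

```python
# k-max pooling: select the k largest values along a dimension,
# preserving their original (temporal) order.
import torch

def kmax_pooling(x: torch.Tensor, k: int, dim: int = -1) -> torch.Tensor:
    idx = x.topk(k, dim=dim).indices.sort(dim=dim).values
    return x.gather(dim, idx)

x = torch.tensor([[1.0, 4.0, 2.0, 5.0, 3.0]])
print(kmax_pooling(x, k=3))  # tensor([[4., 5., 3.]])
```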

Why CNN is effective
Johnson and Zhang, "Effective Use of Word Order for Text Categorization with Convolutional Neural Networks", NAACL 2015.
- It has been noted that the loss of word order caused by bag-of-words (BoW) vectors is particularly problematic for sentiment classification; a simple remedy is to use word bigrams in addition to unigrams
- Comparing an SVM with up-to-trigram features against a CNN with window filters of width 1, 2, and 3, the top 100 features break down as:

Top 100 features   SVM   CNN
Unigrams           68    7
Bigrams            28    33
Trigrams           4     60

SVMs cannot fully take advantage of high-order n-grams.

Sentiment classification considering features beyond text with CNN models
Tang et al., "Learning Semantic Representations of Users and Products for Document Level Sentiment Classification", ACL 2015.

Recursive Neural Tensor Network
Socher et al., "Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank", EMNLP 2013; https://nlp.stanford.edu/sentiment/
- The Stanford Sentiment Treebank is a corpus with fully labeled parse trees, created to facilitate analysis of the compositional effects of sentiment in language
- 10,662 sentences from movie reviews, parsed by the Stanford parser; 215,154 phrases are labeled
- A model called the Recursive Neural Tensor Network (RNTN) was proposed

Distribution of sentiment values for n-grams: stronger sentiment often builds up in longer phrases, and the majority of the shorter phrases are neutral.

Recursive Neural Tensor Network (RNTN): composition
With $f = \tanh$, the parent vector of two child vectors $a, b$ is

$$p = f\Big([a;b]^{\top}\, V^{[1:d]}\, [a;b] + W\,[a;b]\Big)$$

where $V$ is the tensor that directly relates the input vectors and $W$ is the regular RNN weight matrix.
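A direct NumPy transcription of this composition (dimensions are illustrative; in the paper, V, W, and the word vectors are all learned jointly):

```python
# RNTN composition: p = tanh([a;b]^T V [a;b] + W [a;b]).
import numpy as np

d = 4
a, b = np.random.randn(d), np.random.randn(d)
V = np.random.randn(d, 2 * d, 2 * d)  # tensor relating the two children
W = np.random.randn(d, 2 * d)         # standard RNN weight matrix
ab = np.concatenate([a, b])           # [a; b]

tensor_term = np.array([ab @ V[i] @ ab for i in range(d)])
p = np.tanh(tensor_term + W @ ab)     # parent vector, fed up the parse tree
print(p)
```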

LSTM for sentiment analysis
Wang et al., "Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory", ACL 2015.
- LSTMs work tremendously well on a large number of problems; such architectures are more capable of learning complex composition, such as negation of word vectors, than simple RNNs
- Input, stored information, and output are controlled by three gates
- Dataset: the Stanford Twitter Sentiment corpus (STS)
- LSTM-TLT: word-embedding vectors as input, with a trainable look-up table (TLT)
- It is observed that negations are better captured
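For reference, the three gates mentioned above in the standard LSTM formulation (a textbook rendering, not the paper's exact notation):

$$\begin{aligned} i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\ o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\ c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)\\ h_t &= o_t \odot \tanh(c_t) \end{aligned}$$

For sentiment, the final state $h_T$ (or a pooling over all $h_t$) feeds a softmax classifier.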

Gated recurrent neural network
Tang et al., "Document Modeling with Gated Recurrent Neural Network for Sentiment Classification", EMNLP 2015.
- Use a CNN/LSTM to generate sentence representations from word vectors
- Use a gated recurrent neural network (GRU) to encode the relations between sentences for document-level sentiment classification
- The GRU can be viewed as a variant of the LSTM with the output gate always on
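The GRU equations for reference (sign conventions for $z_t$ vary across papers), with update gate $z_t$ and reset gate $r_t$:

$$\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1})\\ r_t &= \sigma(W_r x_t + U_r h_{t-1})\\ \tilde{h}_t &= \tanh\big(W x_t + U (r_t \odot h_{t-1})\big)\\ h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \end{aligned}$$

Because the state $h_t$ is exposed directly, with no output gate modulating it, the GRU behaves like an LSTM whose output gate is always on, as the slide notes.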

Regional CNN-LSTM
J. Wang et al., "Dimensional Sentiment Analysis Using a Regional CNN-LSTM Model", ACL 2016.
- The dimensional approach represents emotional states as continuous numerical values in multiple dimensions, such as the valence-arousal (VA) space (Russell, 1980)
- Valence refers to the degree of positive vs. negative sentiment, whereas arousal refers to the degree of calm vs. excitement

Tree-LSTM
K. S. Tai et al., "Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks", ACL 2015.
- Tree-LSTM: a generalization of LSTMs to tree-structured network topologies
- Tree-LSTMs outperform all previous systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank)
- The variants achieve comparable accuracy, with the constituency-tree version performing better
- Word vectors are initialized with GloVe vectors (trained on 840 billion tokens of Common Crawl data; https://nlp.stanford.edu/projects/glove/)
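For reference, the Child-Sum Tree-LSTM equations from the paper: node $j$ sums its children's hidden states ($C(j)$ denotes the set of children) and keeps a separate forget gate per child:

$$\begin{aligned} \tilde{h}_j &= \textstyle\sum_{k \in C(j)} h_k\\ i_j &= \sigma\big(W^{(i)} x_j + U^{(i)} \tilde{h}_j + b^{(i)}\big)\\ f_{jk} &= \sigma\big(W^{(f)} x_j + U^{(f)} h_k + b^{(f)}\big)\\ o_j &= \sigma\big(W^{(o)} x_j + U^{(o)} \tilde{h}_j + b^{(o)}\big)\\ u_j &= \tanh\big(W^{(u)} x_j + U^{(u)} \tilde{h}_j + b^{(u)}\big)\\ c_j &= i_j \odot u_j + \textstyle\sum_{k \in C(j)} f_{jk} \odot c_k\\ h_j &= o_j \odot \tanh(c_j) \end{aligned}$$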

Prior knowledge + deep neural networks
In each iteration:
- The teacher network is obtained by projecting the student network onto a rule-regularized subspace
- The student network is updated to balance between emulating the teacher's output and predicting the true labels
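The slide does not give the formula; a common way to write this balancing objective, in the spirit of knowledge distillation (notation mine: $s_i^{(t)}$ is the teacher's rule-regularized soft prediction for example $i$, $\pi$ the imitation weight, $\ell$ a cross-entropy loss), is:

$$\theta^{(t+1)} = \arg\min_{\theta}\; \frac{1}{n} \sum_{i=1}^{n} \Big[(1-\pi)\,\ell\big(y_i,\, p_\theta(x_i)\big) + \pi\,\ell\big(s_i^{(t)},\, p_\theta(x_i)\big)\Big]$$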
