




《大数据专业英语教程》 (Big Data Professional English Course), 机械工业出版社 (China Machine Press)
Sample Examination Paper
Prepared by: 张强华, 司爱侠

Sample Examination Paper

I. Write the Chinese meaning of each of the following words (0.5 points each, 10 points in total)

1 accumulate
2 operation
3 complexity
4 filtering
5 leakage
6 engine
7 recovery
8 storage
9 ensure
10 accumulate
11 authentication
12 malware
13 ransomware
14 vulnerability
15 process
16 validity
17 interpretation
18 classification
19 element
20 executable

II. Write the English word for each of the following Chinese meanings (0.5 points each, 10 points in total)

1 n. 元数据
2 n. 特性;属性
3 n. 服务器
4 n. 推荐引擎;推荐系统
5 n. 标准,规格
6 adj. 定性的
7 n. 登记,注册
8 n. 备份
9 n. 容量;性能
10 n. 冗余;过多,过剩
11 n. 并发(性)
12 n. 数据库
13 adj. 程序的,过程的
14 n. 仓库;贮藏室
15 vt. 收集;采集
16 n. 聚集;集成;集结
17 adj. 跨平台的
18 vt. 取得,获得;实现
19 n. 体系结构;(总体、层次)结构
20 v. 担保;确保 n. 保证;保修单

III. Write the Chinese meaning of each of the following phrases

1 data flow
2 data mart
3 data mining
4 data sharing
5 data definition
6 data storage
7 data visualization
8 operating system
9 semi-structured data
10 sample data

IV. Write the English phrase for each of the following Chinese meanings (1 point each, 10 points in total)

1 非结构化数据
2 层次数据模型,分级数据模型
3 文本分析
4 数据点
5 数据收集
6 自治数据库
7 数据仓库
8 混合云
9 机器学习
10 非关系数据库

V. Write the full form and the Chinese meaning of each of the following abbreviations (1 point each, 10 points in total)

No.  Abbreviation  Full form  Chinese meaning
1  AI
2  BDF
3  CMS
4  API
5  DDL
6  DML
7  DQL
8  ELT
9  JVM
10  SLA

VI. Read the passage and answer the questions (2 points each, 10 points in total)

The Importance of Clustering and Classification in Data Science

The purpose of clustering and classification algorithms is to make sense of and extract value from large sets of structured and unstructured data. If you're working with huge volumes of unstructured data, it only makes sense to try to partition the data into some sort of logical groupings before attempting to analyze it. Clustering and classification allow you to take a sweeping glance of your data en masse, and then form some logical structures based on what you find there before going deeper into the nuts-and-bolts analysis.

In their simplest form, clusters are sets of data points that share similar attributes, and clustering algorithms are the methods that group these data points into different clusters based on their similarities. You'll see clustering algorithms used for disease classification in medical science, but you'll also see them used for customer classification in marketing research and for environmental health risk assessment in environmental engineering.

There are different clustering methods, depending on how you want your dataset to be divided. The two main types of clustering algorithms are:
Hierarchical: Algorithms create separate sets of nested clusters, each in their own hierarchical level.
Partitional: Algorithms create just a single set of clusters.

You might have heard of classification and thought that classification is the same thing as clustering. Many people do, but this is not the case. In classification, before you start, you already know the number of classes into which your data should be grouped and you already know what class you want each data point to be assigned. In classification, the data in the dataset being learned from is labeled. When you use clustering algorithms, on the other hand, you have no predefined concept for how many clusters are appropriate for your data, and you rely upon the clustering algorithms to sort and cluster the data in the most appropriate way. With clustering techniques, you're learning from unlabeled data.

To better illustrate the nature of classification, though, take a look at Twitter and its hash-tagging system. Say you just got hold of your favorite drink in the entire world: an iced caramel latte from Starbucks. You're so happy to have your drink that you decide to tweet about it with a photo and the phrase "This is the best latte EVER! #StarbucksRocks." Well, of course, you include "#StarbucksRocks" in your tweet so that the tweet goes into the #StarbucksRocks stream and is classified together with all the other tweets that have been labeled as #StarbucksRocks. Your use of the hashtag label in your tweet told Twitter how to classify your data into a recognizable and accessible group, or cluster.

1. What is the purpose of clustering and classification algorithms?
2. What are clusters in their simplest form? What are clustering algorithms?
3. How many main types of clustering algorithms are there? What are they?
4. What do you already know in classification?
5. How can you better illustrate the nature of classification?

VII. Fill each blank with the appropriate word from the list; use each word only once (10 points each, 20 points in total)

Cloze 1

Words to choose from: unique, hierarchical, processes, include, acceptable, involves, accuracy, hackers, linked, issues

Types of Data Integrity

There are two types of data integrity: physical integrity and logical integrity. Both are a collection of processes and methods that enforce data integrity in both ___1___ and relational databases.

Physical integrity
Physical integrity is the protection of data's wholeness and ___2___ as it's stored and retrieved. When natural disasters strike, power goes out, or hackers disrupt database functions, physical integrity is compromised. Human error, storage erosion, and a host of other ___3___ can also make it impossible for data processing managers, system programmers, applications programmers, and internal auditors to obtain accurate data.

Logical integrity
Logical integrity keeps data unchanged as it's used in different ways in a relational database. Logical integrity protects data from human error and ___4___ as well, but in a much different way than physical integrity does. There are four types of logical integrity.

2.1 Entity integrity
Entity integrity relies on the creation of primary keys, or ___5___ values that identify pieces of data, to ensure that data isn't listed more than once and that no field in a table is null. It's a feature of relational systems which store data in tables that can be ___6___ and used in a variety of ways.

2.2 Referential integrity
Referential integrity refers to the series of ___7___ that makes sure data is stored and used uniformly. Rules embedded into the database's structure about how foreign keys are used ensure that only appropriate changes, additions, or deletions of data occur. Rules may ___8___ constraints that eliminate the entry of duplicate data, guarantee that data is accurate, and/or disallow the entry of data that doesn't apply.

2.3 Domain integrity
Domain integrity is the collection of processes that ensure the accuracy of each piece of data in a domain. In this context, a domain is a set of ___9___ values that a column is allowed to contain. It can include constraints and other measures that limit the format, type, and amount of data entered.

2.4 User-defined integrity
User-defined integrity ___10___ the rules and constraints created by the user to fit their particular needs. Sometimes entity, referential, and domain integrity aren't enough to safeguard data. Often, specific business rules must be taken into account and incorporated into data integrity measures.

Cloze 2

Words to choose from: programs, architecture, layer, handling, create, center, infrastructure, networks, storage, machines

Big Data Cloud Reference Architecture

The cloud architecture for big data is efficient at managing complicated computing scalability, storage, and networking infrastructure. Infrastructure-as-a-service providers mainly deal with servers, ___1___, in addition to storage applications, and offer facilities such as virtualization, basic monitoring and safety, operating system, server in a data ___2___, and storage services. The four layers of big data cloud architecture are discussed below:

Big Data Analytics - Software as a Service (BDA-SaaS): The analytics of big data offered as a service gives users the capability to quickly work on analytics without spending on ___3___ and pay for the facilities used. The functions of this layer are:
• Arrangement of software applications repository
• Software ___4___ deployment on the infrastructure
• Result delivery to the users.

Big Data Analytics - Platform as a Service (BPaaS): This is the second layer of the ___5___. It is the core layer that provides platform-related services to work with stored big data and computing. Data management tools, schedulers, and programming environments for data-intensive and data processing tasks, which are considered middleware management tools, reside in this region. This ___6___ is responsible for developing software development kits and tools necessary for analytics.

Big Data Fabric (BDF): This is the fabric layer of big data, responsible for addressing tools and APIs that support the ___7___ of data, data computation, and access to different application services. This layer comprises APIs and interoperable protocols designed to connect the specified multiple cloud infrastructural standards.

Cloud Infrastructure (CI): The cloud infrastructure is responsible for ___8___ the infrastructure for data storage and computation as services. The services offered by the CI layer are as follows:
● To create large-scale elastic infrastructure for big data storage, capable of on-demand deployment.
● To set up dynamic virtual ___9___.
● To generate on-demand storage facilities that relate to big data management for file, block, and object-based storage.
● To enable seamless passage of data across the storage repositories.
● To ___10___ virtual machines and to mount the file system with the compute node.

VIII. Passage translation (10 points each, 20 points in total)

Translation 1

Data Cleaning

What is data cleaning?
Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. Data cleaning, which is also referred to as data cleansing and data scrubbing, is one of the most important steps for your organization if you want to create a culture around quality data decision-making. Data cleaning is not simply about erasing information to make space for new data, but rather finding a way to maximize a dataset's accuracy without necessarily deleting information. Data cleaning includes more actions than removing data, such as fixing spelling and syntax errors, standardizing datasets, and correcting mistakes such as empty fields, missing codes, and identifying duplicate data points. Most importantly, the goal of data cleaning is to create datasets that are standardized and uniform to allow business intelligence and data analytics tools to easily access and find the right data for each query.

What is the difference between data cleaning and data transformation?
Data cleaning is the process that removes data that does not belong in your dataset. Data transformation is the process of converting data from one format or structure into another. Transformation processes can also be referred to as data wrangling, or data munging, transforming and mapping data from one "raw" data form into another format for warehousing and analyzing.

Benefits of data cleaning
Having clean data will ultimately increase overall productivity and allow for the highest quality information in your decision-making. The benefits include:
● Removal of errors when multiple sources of data are at play.
● Fewer errors make for happier clients and less-frustrated employees.
● Ability to map the different functions and what your data is intended to do.
● Monitoring errors and better reporting to see where errors are coming from, making it easier to fix incorrect or corrupt data for future applications.
● Using tools for data cleaning will make for more efficient business practices and quicker decision-making.

Translation 2

Data Visualization
Data visualization is the practice of translating information into a visual context, such as a map or graph, to make data easier for the human brain to understand and pull insights from. The main goal of data visualization is to make it easier to identify patterns, trends and outliers in large data sets. The term is often used interchangeably with others, including information graphics, information visualization and statistical graphics.

Data visualization is one of the steps of the data science process, which states that after data has been collected, processed and modeled, it must be visualized for conclusions to be made. Data visualization is also an element of the broader data presentation architecture (DPA) discipline, which aims to identify, locate, manipulate, format and deliver data in the most efficient way possible.

Data visualization is important for almost every career. It can be used by teachers to display student test results, by computer scientists exploring advancements in artificial intelligence (AI) or by executives looking to share information with stakeholders. It also plays an important role in big data projects. As businesses accumulated massive collections of data during the early years of the big data trend, they needed a way to quickly and easily get an overview of their data. Visualization tools were a natural fit.

Visualization is central to advanced analytics for similar reasons. When a data scientist is writing advanced predictive analytics or machine learning (ML) algorithms, it becomes important to visualize the outputs to monitor results and ensure that models are performing as intended. This is because visualizations of complex algorithms are generally easier to interpret than numerical outputs.

Data visualization provides a quick and effective way to communicate information in a universal manner using visual information. The practice can also help businesses identify which factors affect customer behavior; pinpoint areas that need to be improved or need more attention; make data more memorable for stakeholders; understand when and where to place specific products; and predict sales volumes.

Other benefits of data visualization include:
● the ability to absorb information quickly, improve insights and make faster decisions;
● an increased understanding of the next steps that must be taken to improve the organization;
● an improved ability to maintain the audience's interest with information they can understand;
● an easy distribution of information that increases the opportunity to share insights with everyone involved;
● eliminating the need for data scientists since data is more accessible and understandable; and
● an increased ability to act on findings quickly and, therefore, achieve success with greater speed and fewer mistakes.
Answers to the Sample Examination Paper

I. Write the Chinese meaning of each of the following words (0.5 points each, 10 points in total)

1 accumulate  v. 堆积,积累
2 operation  n. 操作;运算
3 complexity  n. 复杂性
4 filtering  n. 过滤
5 leakage  n. 漏出;泄露
6 engine  n. 引擎,发动机
7 recovery  n. 恢复,复原
8 storage  n. 贮存
9 ensure  vt. 确保
10 accumulate  v. 堆积,积累
11 authentication  n. 身份验证;认证
12 malware  n. 恶意软件,流氓软件
13 ransomware  n. 勒索软件
14 vulnerability  n. 弱点;脆弱性
15 process  vt. 加工;处理
16 validity  n. 有效性,合法性
17 interpretation  n. 解释,说明
18 classification  n. 分类,归类
19 element  n. 元素;要素;原理
20 executable  adj. 可执行的;实行的

II. Write the English word for each of the following Chinese meanings (0.5 points each, 10 points in total)

1 n. 元数据  metadata
2 n. 特性;属性  property
3 n. 服务器  server
4 n. 推荐引擎;推荐系统  recommender
5 n. 标准,规格  standard
6 adj. 定性的  qualitative
7 n. 登记,注册  registration
8 n. 备份  backup
9 n. 容量;性能  capacity
10 n. 冗余;过多,过剩  redundancy
11 n. 并发(性)  concurrency
12 n. 数据库  database
13 adj. 程序的,过程的  procedural
14 n. 仓库;贮藏室  repository
15 vt. 收集;采集  gather
16 n. 聚集;集成;集结  aggregation
17 adj. 跨平台的  cross-platform
18 vt. 取得,获得;实现  achieve
19 n. 体系结构;(总体、层次)结构  architecture
20 v. 担保;确保 n. 保证;保修单  guarantee

III. Write the Chinese meaning of each of the following phrases

1 data flow  数据流
2 data mart  数据集市
3 data mining  数据挖掘
4 data sharing  数据共享
5 data definition  数据定义
6 data storage  数据存储
7 data visualization  数据可视化
8 operating system  操作系统
9 semi-structured data  半结构化数据
10 sample data  样本数据

IV. Write the English phrase for each of the following Chinese meanings (1 point each, 10 points in total)

1 非结构化数据  unstructured data
2 层次数据模型,分级数据模型  hierarchical data model
3 文本分析  text analysis
4 数据点  data point
5 数据收集  data collection
6 自治数据库  autonomous database
7 数据仓库  data warehouse
8 混合云  hybrid cloud
9 机器学习  machine learning
10 非关系数据库  non-relational database

V. Write the full form and the Chinese meaning of each of the following abbreviations (1 point each, 10 points in total)

No.  Abbreviation  Full form  Chinese meaning
1  AI  Artificial Intelligence  人工智能
2  BDF  Big Data Fabric  大数据结构
3  CMS  Content Management System  内容管理系统
4  API  Application Programming Interface  应用程序编程接口
5  DDL  Data Definition Language  数据定义语言
6  DML  Data Manipulation Language  数据操作语言
7  DQL  Data Query Language  数据查询语言
8  ELT  Extract, Load, Transform  提取、加载、转换
9  JVM  Java Virtual Machine  Java虚拟机
10  SLA  Service Level Agreement  服务等级协议,服务级别协议

VI. Read the passage and answer the questions (2 points each, 10 points in total)

1. The purpose of clustering and classification algorithms is to make sense of and extract value from large sets of structured and unstructured data.
2. In their simplest form, clusters are sets of data points that share similar attributes, and clustering algorithms are the methods that group these data points into different clusters based on their similarities.
3. There are two main types of clustering algorithms. They are hierarchical algorithms and partitional algorithms.
4. In classification, before you start, you already know the number of classes into which your data should be grouped and you already know what class you want each data point to be assigned.
5. To better illustrate the nature o…