




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、marianne bargiotticernmarianne bargiottichep07 sep 2-7 2007, victoria, bc2outlineo dirac data management system (dms)o lhcb catalogues and physical storageo dms integrity checkingo conclusionsmarianne bargiottichep07 sep 2-7 2007, victoria, bc3dirac data management systemo dirac project (distributed
2、 infrastructure with remote agent control) is the lhcb workload and data management systemqdirac architecture based on services and agentssee a. tsaregorodsev poster 189o the dirac data management system deals with three components: allows to know where files are storedallows to know what are the co
3、ntents of the filesunderlying grid storage elements (se) where files are storedconsistency between these catalogues and storage elements is for a reliable data managementsee a.c smith poster 195marianne bargiottichep07 sep 2-7 2007, victoria, bc4lhcb file catalogueo lhcb choice: lcg file catalogue (
4、lfc)qit allows registering and retrieving the location of physical replicas in the grid infrastructure. qit stores:file information (lfn, size, guid)replica informationo dirac wms uses lfc information to decide where jobs can be scheduledfundamental to avoid any kind of inconsistencies both with sto
5、rages and with related catalogues (bk meta data db)o baseline choice for dirac: central lfc qone single master (r/w) and many ro mirrorsqcoherence ensured by single write endpoint marianne bargiottichep07 sep 2-7 2007, victoria, bc5registration of replicaso guid check: before the registration in the
6、 lcg file catalogue, at the beginning of transfer phase, the existence of file guid to be transferred is checkedqto avoid guid mismatch problem in registrationo after a successful transfer, lfc registration of files is divided into 2 atomic operationsq booking of meta data fields with the insertion
7、in the dedicated table of lfn, guid and size q replica registration marianne bargiottichep07 sep 2-7 2007, victoria, bc6lhcb bookkeeping meta data dbo the bookkeeping (bk) is the system that stores data provenience information.o it contains information about jobs and files and their relations: qjob:
8、 application name, application version, application parameters, which files it has generated etc.qfile: size, event, filename, guid, from which job it was generated etc.o the bookkeeping db represents the main gateway for users to select the available data and datasets.o all data visible to users ar
9、e flagged as has replica all the data stored in the bk and flagged as having replica, must be correctly registered and available in lfc.marianne bargiottichep07 sep 2-7 2007, victoria, bc7storage elementso dirac storage element clientqprovides uniform access to grid storage elementsqimplemented with
10、 plug-in modules for access protocolssrm, gridftp, bbftp, sftp, http supportedo srm is the standard interface to grid storageo lhcb has 14 srm endpoints qdisk and tape storage for each t1 site o srm will allow browsing the storage namespace (since srm v2.2)o functionalities are exposed to users thro
11、ugh gfal library api qpython binding of gfal library is used to develop the dirac toolsmarianne bargiottichep07 sep 2-7 2007, victoria, bc8data integrity checkso considering the high number of interactions among dm system components, integrity checking is part of the dirac data management system. o
12、two ways of performing checks: those running as agents within the dirac frameworkthose launched by the data manager to address specific situations. o the agent type of checks can be broken into two further distinct types. qthose solely based on the information found on se/lfc/bkqthose based on a pri
13、ori knowledge of where files should exist based on the computing model i.e dst always present at all t1s disksmarianne bargiottichep07 sep 2-7 2007, victoria, bc9dms integrity agents overviewo the complete suite for integrity checking includes an assortment of agents:marianne bargiottichep07 sep 2-7
14、 2007, victoria, bc10data integrity checks & dm consoleo the data management console is the interface for the data manager.qthe dm console allows data integrity checks to be launched.o the development of these tools has been driven by experienceqmany catalog operations (fixes)bulk extraction of
15、replica informationdeletion of replicas according to sitesextraction of replicas through lfc directorieschange of replicas se name in the cataloguecreations of bulk transfer/removal jobs marianne bargiottichep07 sep 2-7 2007, victoria, bc11bk - lfc consistency agento main problem affecting bk: many
16、lfns registred in the bk but failed to be registred on lfc missing files in the lfc: users trying to select lfns in the bk cant find any replica in the lfcpossible causes: failing of registration on the lfc due to failure on copy, temporary lack of service.o bklfc: performs massive check on producti
17、onsqchecking from bk dumps of different productions against same directories on lfcqfor each production: checking for the existence of each entry from bk against lfccheck on file sizesin case of missing or problematic files, reports to the integritydbbklfcmarianne bargiottichep07 sep 2-7 2007, victo
18、ria, bc12lfc pathologieso many different possible inconsistencies arising in a complex computing model:q file metadata registred on lfc but missing information on size (set to 0):q missing replica field in the replica information table on the db(bugs from dirac old version, now fixed)q srm:/gridka-d
19、cache.fzk.de:8443/castor/cern.ch/grid/lhcb/production/dc06/v1-lumi2/00001354/digi/0000/00001354_00000027_9.digi gridka-tapeq cern_castor, wrong info in the lhcb configuration service q sfn, rfio, bbftp q blank spaces on the surl path, carriage returns, presence of port number in the surl path.marian
20、ne bargiottichep07 sep 2-7 2007, victoria, bc13lfc se consistency agento lfc replicas need perfect coherence with storage replicas both in path, protocol and size:qreplication issue: check whether the lfc replicas are really resident on physical storages (check the existence and the size of files)qr
21、egistration issues: lfc-se agent stores problematic files in central integritydb according to different pathologies:lfcsemarianne bargiottichep07 sep 2-7 2007, victoria, bc14se lfc consistency agento checks the se contents against lcg file catalogue:qlists the contents of the se qchecks against the
22、catalogue for corresponding replicas qmissing efficient storage interface for bulk meta data queries (directory listings) not possible to list the content of remote directories and getting associated meta-data (lcg-ls)selfcmarianne bargiottichep07 sep 2-7 2007, victoria, bc15storage usage agento usi
23、ng the registered replicas and their sizes on the lfc, this agent constructs an exhaustive picture of current lhcb storage usage:qworks through breakdown by directoriesqloops on lfc extracting files sizes according to different storagesqstores information on central integritydbqproduce a full pictur
24、e of disk and tape occupancy on each storage qprovides an up-to-dated picture of lhcbs usage of resources in almost real time o foreseen development: using lfc accounting interface to have a global picture per sitemarianne bargiotti16data integrity agent o the integrity agent takes actions over a wi
25、de number of pathologies stored by agents in the integritydb.o action taken:qlfc se:in case of missing replica on lfc: produce surl paths starting from lfn, according to dirac configuration system for all the defined storage elements;extensive search throughout all t1 sesif search successful, regist
26、ration of missing replicas. same action in case of zero-size files, wrong sa-path,.qbk - lfc:if file not present on lfc:extensive research performed on all sesif file is not found anywhere removal of flag has replica: no more visible to usersif file is found: update of lfc with missing file infos ex
27、tracted from storagesqse lfc:files missing from the catalogue can be:registered in catalogue if lfn is presentdeleted from se if lfn is missing on the cataloguebkselfclfcsedirac csselfcsemarianne bargiottichep07 sep 2-7 2007, victoria, bc17prevention of inconsistenciesqeach operation that can fail i
28、s wrapped in a xml record as a request which can be stored in a request db.qrequest dbs are sitting in one of the lhcb vo boxes, which ensures that these records will never be lostqthese requests are executed by dedicated agents running on vo boxes, and are retried as many times as needed until they
29、 succeedqexamples: files registration operation, data transfer operation, bk registrationqchecking on file transfers based on file size or checksum, etc.marianne bargiottichep07 sep 2-7 2007, victoria, bc18conclusionso integrity checks suite is an important part of data management activityo further
30、development will be possible with srm v2 (se vs lfc agent)o most effort now in the prevention of inconsistencies (checksums, failover mechanisms)marianne bargiottichep07 sep 2-7 2007, victoria, bc19backupmarianne bargiottichep07 sep 2-7 2007, victoria, bc20dirac architectureo dirac (distributed infr
31、astructure with remote agent control) is the lhcbs grid projecto dirac architecture split into three main component types: qservices - independent functionalities deployed and administered centrally on machines accessible by all other dirac componentsqresources - grid compute and storage resources at remote sitesqagents - lightweight software components that request jobs from the central services for a specific purpose. o the dirac data management system is made up an assortment of these
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 智慧小镇申论题目及答案
- 六年级数学教研组工作计划(5篇)
- 食品饮料餐饮销售市场推广拓展计划
- 农业生产技术规范及标准
- 中级爆破工程考试试题及答案
- 2025年山东省烟台市事业单位工勤技能考试题库(含答案)
- CN120269262A 一种装修用支架焊接的辅助工装 (苏州卿露扬精密机械科技有限公司)
- CN120111157B 基于fpga的高清图像旋转系统及方法 (成都维德青云电子有限公司)
- UPS电池安装安全培训课件
- CN120107836A 一种体育跑道路面损伤识别方法 (泉州信息工程学院)
- 石油管道保护施工方案
- 2025秋开学典礼 校长引用电影《长安的荔枝》讲话:荔枝尚早,路正长远-在时光中奔跑,用行动送达自己的“长安”
- 中级经济师模拟试题及答案
- 家庭食品卫生知识培训课件
- 无人机应用技术培训教材
- 地铁安保培训课件
- 华中数控车床课件
- 2025年食品安全监督员专业技能考核试题及答案解析
- 七年级初一新生家长会上校长走心讲话:陪孩子一起长大是一场不能重来的旅程
- 企业微信办公使用教程
- 学堂在线 大学历史与文化 章节测试答案
评论
0/150
提交评论