DIRAC Data Management conistency, integrity and coherence of 狄拉克的数据管理的一致性完整性与一致性_第1页
DIRAC Data Management conistency, integrity and coherence of 狄拉克的数据管理的一致性完整性与一致性_第2页
DIRAC Data Management conistency, integrity and coherence of 狄拉克的数据管理的一致性完整性与一致性_第3页
DIRAC Data Management conistency, integrity and coherence of 狄拉克的数据管理的一致性完整性与一致性_第4页
DIRAC Data Management conistency, integrity and coherence of 狄拉克的数据管理的一致性完整性与一致性_第5页
已阅读5页,还剩17页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、marianne bargiotticernmarianne bargiottichep07 sep 2-7 2007, victoria, bc2outlineo dirac data management system (dms)o lhcb catalogues and physical storageo dms integrity checkingo conclusionsmarianne bargiottichep07 sep 2-7 2007, victoria, bc3dirac data management systemo dirac project (distributed

2、 infrastructure with remote agent control) is the lhcb workload and data management systemqdirac architecture based on services and agentssee a. tsaregorodsev poster 189o the dirac data management system deals with three components: allows to know where files are storedallows to know what are the co

3、ntents of the filesunderlying grid storage elements (se) where files are storedconsistency between these catalogues and storage elements is for a reliable data managementsee a.c smith poster 195marianne bargiottichep07 sep 2-7 2007, victoria, bc4lhcb file catalogueo lhcb choice: lcg file catalogue (

4、lfc)qit allows registering and retrieving the location of physical replicas in the grid infrastructure. qit stores:file information (lfn, size, guid)replica informationo dirac wms uses lfc information to decide where jobs can be scheduledfundamental to avoid any kind of inconsistencies both with sto

5、rages and with related catalogues (bk meta data db)o baseline choice for dirac: central lfc qone single master (r/w) and many ro mirrorsqcoherence ensured by single write endpoint marianne bargiottichep07 sep 2-7 2007, victoria, bc5registration of replicaso guid check: before the registration in the

6、 lcg file catalogue, at the beginning of transfer phase, the existence of file guid to be transferred is checkedqto avoid guid mismatch problem in registrationo after a successful transfer, lfc registration of files is divided into 2 atomic operationsq booking of meta data fields with the insertion

7、in the dedicated table of lfn, guid and size q replica registration marianne bargiottichep07 sep 2-7 2007, victoria, bc6lhcb bookkeeping meta data dbo the bookkeeping (bk) is the system that stores data provenience information.o it contains information about jobs and files and their relations: qjob:

8、 application name, application version, application parameters, which files it has generated etc.qfile: size, event, filename, guid, from which job it was generated etc.o the bookkeeping db represents the main gateway for users to select the available data and datasets.o all data visible to users ar

9、e flagged as has replica all the data stored in the bk and flagged as having replica, must be correctly registered and available in lfc.marianne bargiottichep07 sep 2-7 2007, victoria, bc7storage elementso dirac storage element clientqprovides uniform access to grid storage elementsqimplemented with

10、 plug-in modules for access protocolssrm, gridftp, bbftp, sftp, http supportedo srm is the standard interface to grid storageo lhcb has 14 srm endpoints qdisk and tape storage for each t1 site o srm will allow browsing the storage namespace (since srm v2.2)o functionalities are exposed to users thro

11、ugh gfal library api qpython binding of gfal library is used to develop the dirac toolsmarianne bargiottichep07 sep 2-7 2007, victoria, bc8data integrity checkso considering the high number of interactions among dm system components, integrity checking is part of the dirac data management system. o

12、two ways of performing checks: those running as agents within the dirac frameworkthose launched by the data manager to address specific situations. o the agent type of checks can be broken into two further distinct types. qthose solely based on the information found on se/lfc/bkqthose based on a pri

13、ori knowledge of where files should exist based on the computing model i.e dst always present at all t1s disksmarianne bargiottichep07 sep 2-7 2007, victoria, bc9dms integrity agents overviewo the complete suite for integrity checking includes an assortment of agents:marianne bargiottichep07 sep 2-7

14、 2007, victoria, bc10data integrity checks & dm consoleo the data management console is the interface for the data manager.qthe dm console allows data integrity checks to be launched.o the development of these tools has been driven by experienceqmany catalog operations (fixes)bulk extraction of

15、replica informationdeletion of replicas according to sitesextraction of replicas through lfc directorieschange of replicas se name in the cataloguecreations of bulk transfer/removal jobs marianne bargiottichep07 sep 2-7 2007, victoria, bc11bk - lfc consistency agento main problem affecting bk: many

16、lfns registred in the bk but failed to be registred on lfc missing files in the lfc: users trying to select lfns in the bk cant find any replica in the lfcpossible causes: failing of registration on the lfc due to failure on copy, temporary lack of service.o bklfc: performs massive check on producti

17、onsqchecking from bk dumps of different productions against same directories on lfcqfor each production: checking for the existence of each entry from bk against lfccheck on file sizesin case of missing or problematic files, reports to the integritydbbklfcmarianne bargiottichep07 sep 2-7 2007, victo

18、ria, bc12lfc pathologieso many different possible inconsistencies arising in a complex computing model:q file metadata registred on lfc but missing information on size (set to 0):q missing replica field in the replica information table on the db(bugs from dirac old version, now fixed)q srm:/gridka-d

19、cache.fzk.de:8443/castor/cern.ch/grid/lhcb/production/dc06/v1-lumi2/00001354/digi/0000/00001354_00000027_9.digi gridka-tapeq cern_castor, wrong info in the lhcb configuration service q sfn, rfio, bbftp q blank spaces on the surl path, carriage returns, presence of port number in the surl path.marian

20、ne bargiottichep07 sep 2-7 2007, victoria, bc13lfc se consistency agento lfc replicas need perfect coherence with storage replicas both in path, protocol and size:qreplication issue: check whether the lfc replicas are really resident on physical storages (check the existence and the size of files)qr

21、egistration issues: lfc-se agent stores problematic files in central integritydb according to different pathologies:lfcsemarianne bargiottichep07 sep 2-7 2007, victoria, bc14se lfc consistency agento checks the se contents against lcg file catalogue:qlists the contents of the se qchecks against the

22、catalogue for corresponding replicas qmissing efficient storage interface for bulk meta data queries (directory listings) not possible to list the content of remote directories and getting associated meta-data (lcg-ls)selfcmarianne bargiottichep07 sep 2-7 2007, victoria, bc15storage usage agento usi

23、ng the registered replicas and their sizes on the lfc, this agent constructs an exhaustive picture of current lhcb storage usage:qworks through breakdown by directoriesqloops on lfc extracting files sizes according to different storagesqstores information on central integritydbqproduce a full pictur

24、e of disk and tape occupancy on each storage qprovides an up-to-dated picture of lhcbs usage of resources in almost real time o foreseen development: using lfc accounting interface to have a global picture per sitemarianne bargiotti16data integrity agent o the integrity agent takes actions over a wi

25、de number of pathologies stored by agents in the integritydb.o action taken:qlfc se:in case of missing replica on lfc: produce surl paths starting from lfn, according to dirac configuration system for all the defined storage elements;extensive search throughout all t1 sesif search successful, regist

26、ration of missing replicas. same action in case of zero-size files, wrong sa-path,.qbk - lfc:if file not present on lfc:extensive research performed on all sesif file is not found anywhere removal of flag has replica: no more visible to usersif file is found: update of lfc with missing file infos ex

27、tracted from storagesqse lfc:files missing from the catalogue can be:registered in catalogue if lfn is presentdeleted from se if lfn is missing on the cataloguebkselfclfcsedirac csselfcsemarianne bargiottichep07 sep 2-7 2007, victoria, bc17prevention of inconsistenciesqeach operation that can fail i

28、s wrapped in a xml record as a request which can be stored in a request db.qrequest dbs are sitting in one of the lhcb vo boxes, which ensures that these records will never be lostqthese requests are executed by dedicated agents running on vo boxes, and are retried as many times as needed until they

29、 succeedqexamples: files registration operation, data transfer operation, bk registrationqchecking on file transfers based on file size or checksum, etc.marianne bargiottichep07 sep 2-7 2007, victoria, bc18conclusionso integrity checks suite is an important part of data management activityo further

30、development will be possible with srm v2 (se vs lfc agent)o most effort now in the prevention of inconsistencies (checksums, failover mechanisms)marianne bargiottichep07 sep 2-7 2007, victoria, bc19backupmarianne bargiottichep07 sep 2-7 2007, victoria, bc20dirac architectureo dirac (distributed infr

31、astructure with remote agent control) is the lhcbs grid projecto dirac architecture split into three main component types: qservices - independent functionalities deployed and administered centrally on machines accessible by all other dirac componentsqresources - grid compute and storage resources at remote sitesqagents - lightweight software components that request jobs from the central services for a specific purpose. o the dirac data management system is made up an assortment of these

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论