Chep06 : Resilient dCache: Replicating Files for Integrity and Availability
Title
Resilient dCache: Replicating Files for Integrity and Availability
Author(s)
Alex Kulyavtsev
FERMI
aik@fnal.gov
for the dCache team
Abstract
dCache is a distributed storage system currently used to store and
deliver data on a petabyte scale in several large HEP experiments. Initially
dCache was designed as a disk front-end for robotic tape storage file systems.
Recently, dCache systems have grown in scale by several orders of
magnitude and are being considered for deployment in US-CMS T2 centers that
lack expensive tape robots. This created the need to store data for extended
periods of time on disk-only storage systems, in many cases using very
inexpensive commodity (non-RAID) disk devices purchased specifically for
storage, or opportunistically exploiting spare disk space in computing farms,
adding hundreds of terabytes of storage for little additional cost. The large
number of nodes in a computing cluster and the lower reliability of commodity
disks and computers make it likely that individual files will become lost or
unavailable during normal operations.
Resilient dCache is a new top-level dCache service that addresses these
reliability and file availability issues by keeping several replicas of each
logical file on different dCache disk hardware. The Resilience Manager
automatically keeps the number of copies in the system within a specified
range when files are stored in or removed from dCache, or when disk pool
nodes are found to have crashed, been removed from, or been added to the
system. The Resilience Manager maintains a local file replica catalog and the
disk pool configuration in a PostgreSQL database.
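
The replica-count adjustment described above can be illustrated with a minimal
Python sketch. It is not the dCache implementation: the class ReplicaCatalog,
its fields, the action names, and the default range of 2 to 3 replicas are all
hypothetical, and an in-memory dictionary stands in for the replica catalog
that the Resilience Manager keeps in its database.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Hypothetical in-memory stand-in for the replica catalog."""
    min_replicas: int = 2   # assumed lower bound of the allowed range
    max_replicas: int = 3   # assumed upper bound of the allowed range
    online_pools: set = field(default_factory=set)   # pools currently usable
    locations: dict = field(default_factory=dict)    # file id -> set of pools

    def live_replicas(self, file_id):
        """Replicas of file_id that sit on pools which are currently online."""
        return self.locations.get(file_id, set()) & self.online_pools

    def plan_adjustments(self):
        """Return (action, file_id, pool) tuples that bring every file
        back into the [min_replicas, max_replicas] range."""
        actions = []
        for file_id in sorted(self.locations):
            live = self.live_replicas(file_id)
            if not live:
                # No online copy is left to replicate from.
                actions.append(("unavailable", file_id, None))
            elif len(live) < self.min_replicas:
                # Too few copies: copy to online pools that lack the file.
                needed = self.min_replicas - len(live)
                candidates = sorted(self.online_pools - live)
                for pool in candidates[:needed]:
                    actions.append(("replicate", file_id, pool))
            elif len(live) > self.max_replicas:
                # Too many copies: drop the surplus replicas.
                excess = len(live) - self.max_replicas
                for pool in sorted(live)[-excess:]:
                    actions.append(("reduce", file_id, pool))
        return actions
```

In this sketch the plan would be recomputed whenever a file is stored or
removed, or whenever the set of online pools changes, for example after a pool
node crashes or a new pool is added.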
The paper describes the design of the dCache Resilience Manager and our
experience with its production deployment and operation in US-CMS T1 and T2
centers. We use the configuration "all pools are resilient" in US-CMS T2
centers to store generated data before they are transferred to the T1 center.
The US-CMS T1 center has some pools in a single dCache system configured as
resilient, while the other pools are tape-backed or volatile. Such a
configuration simplifies the administration of the system and data exchange.
We attribute the increase in the amount of data delivered to compute nodes
from the dCache system at the US-CMS T1 center (0.2 PB/day in October 2005) to
the data stored in resilient pools.