CHAPTER 12. dCache STORAGE RESOURCE MANAGER
Storage Resource Managers (SRMs) are middleware components whose function is to provide dynamic space allocation and file management on shared storage components on the Grid. SRMs support protocol negotiation and a reliable replication mechanism. The SRM specification standardizes the interface, thus allowing for a uniform access to heterogeneous storage elements.
The SRM utilizes the Grid Security Infrastructure (GSI) for authentication. The SRM is a Web Service implementing a published WSDL document. Please visit the SRM Working Group Page to check out SRM Version 2.2 protocol specification documents.
The SRM protocol uses HTTP over GSI as a transport. The dCache SRM implementation added HTTPS as a transport layer option. The main benefits of using HTTPS rather than HTTP over GSI is that HTTPS is a standard protocol and has support for sessions, improving latency in case a client needs to connect to the same server multiple times.
- Configuring the srm service
- Utilization of space reservations for data storage
- Configuring the PostgreSQL database
- General SRM concepts (for developers)
Configuring the srm service
Basic setup
The SRM service is split between a front end srm
and a backend srmmanager
for scalability. To instantiate the SRM service both cells need to be started, not necessarily on the same host. The srmmanager
is covered in more detail here.
Like other services, the srm service can be enabled in the layout file /etc/dcache/layouts/mylayout
of your dCache installation. For an overview of the layout file format, please see the section “Creating a minimal dCache configuration”.
Example:
To enable SRM in dCache, add the following lines to your layout file:
[<srm-${host.name}Domain>]
[<srm-${host.name}Domain>/srm]
[srmmanager-${host.name}Domain]
[srmmanager-${host.name}Domain/srmmanager]
[srmmanager-${host.name}Domain/transfermanagers]
The additional transfermanagers
service is required to perform 3rd party copy transfers initiated by SRM or WebDAV. This service is not required to be co-located with th SRM service (domain or host).
The srm service requires an authentication setup, see Chapter 10, Authorization in dCache for a general description or the section “Authentication and Authorization in dCache” for an example setup with X.509 certificates.
You can now copy a file into your dCache using the SRM,
NOTE
Please make sure to use the latest srmcp client, otherwise you will need to specify
-2
in order to use the right version.
srmcp file:////bin/sh srm://dcache.example.org/data/world-writable/srm-test-file
copy it back
srmcp srm://dcache.example.org/data/world-writable/srm-test-file file:////tmp/srmtestfile.tmp
and delete it
srmrm srm://dcache.example.org/data/world-writable/srm-test-file
Important SRM configuration options
The defaults for the following configuration parameters can be found in the srmmanager.properties
, srm.properties
and transfermanagers.properties
files, which are all located in the directory /usr/share/dcache/defaults
.
If you want to modify parameters, copy them to /etc/dcache/dcache.conf
or to your layout file /etc/dcache/layouts/mylayout
and update their value.
Example:
Change the value for srmmanager.db.host
in the layout file.
[srm-${host.name}Domain]
[srm-${host.name}Domain/srmmanager]
srmmanager.db.host=hostname
If a dCache instance contains more than one srmmanager
, it is necessary that each one has a distinct database.
The property srmmanager.request.copy.max-inprogress
controls number of copy requests in the running state. Copy requests are 3-rd party srm transfers and therefore the property transfermanagers.limits.external-transfers
is best to be set to the same value as shown below.
srmmanager.request.copy.max-inprogress=250
transfermanagers.limits.external-transfers=${srm.request.copy.threads}
The common value should be roughly equal to the maximum number of the SRM - to - SRM copies your system can sustain.
Example:
So if you think about 3 gridftp transfers per pool and you have 30 pools then the number should be 3x30=90.
srmmanager.request.copy.max-inprogress=90
transfermanagers.limits.external-transfers=90
Utilization of space reservations for data storage
SRM
version 2.2 introduced the concept of space reservation. Space reservation guarantees that the requested amount of storage space of a specified type is made available by the storage system for a specified amount of time. Users can create space reservations using an appropriate SRM
client, although it is more common for the dCache administrator to make space reservations for VOs.
Space reservation is managed by the spacemanager
service which is detailed here.
Configuring the PostgreSQL database
We highly recommend to make sure that PostgreSQL database files are stored on a separate disk that is not used for anything else (not even PSQL logging). BNL Atlas Tier 1 observed a great improvement in srm-database communication performance after they deployed PSQL on a separate dedicated machine.
SRM or SRM monitoring on a separate node
If SRM
or srm monitoring is going to be installed on a separate node, you need to add an entry in the file /var/lib/pgsql/data/pg_hba.conf
for this node as well:
host all all <monitoring node> trust
host all all <srm node> trust
The file postgresql.conf
should contain the following:
#to enable network connection on the default port
max_connections = 100
port = 5432
...
shared_buffers = 114688
...
work_mem = 10240
...
#to enable autovacuuming
stats_row_level = on
autovacuum = on
autovacuum_vacuum_threshold = 500 # min # of tuple updates before
# vacuum
autovacuum_analyze_threshold = 250 # min # of tuple updates before
# analyze
autovacuum_vacuum_scale_factor = 0.2 # fraction of rel size before
# vacuum
autovacuum_analyze_scale_factor = 0.1 # fraction of rel size before
#
# setting vacuum_cost_delay might be useful to avoid
# autovacuum penalize general performance
# it is not set in US-CMS T1 at Fermilab
#
# In IN2P3 add_missing_from = on
# In Fermilab it is commented out
# - Free Space Map -
max_fsm_pages = 500000
# - Planner Cost Constants -
effective_cache_size = 16384 # typically 8KB each
General SRM concepts (for developers)
The SRM service
dCache SRM
is implemented as a Web Service running in a Jetty servlet container and an Axis Web Services engine. The Jetty server is executed as a cell, embedded in dCache and started automatically by the SRM
service. Other cells started automatically by SRM
are SpaceManager
, PinManager
and RemoteGSIFTPTransferManager
. Of these services only SRM
and SpaceManager require special configuration.
The SRM
concept consists of the five categories of functions:
- Space Management
- Data Transfer Functions
- Request Status Functions
- Directory Functions
- Permission Functions
Space management functions
SRM
version 2.2 introduces a concept of space reservation. Space reservation guarantees that the requested amount of storage space of a specified type is made available by the storage system for a specified amount of time. It is detailed in this chapter.
We use three functions for space management:
- srmReserveSpace
- SrmGetSpaceMetadata
- srmReleaseSpace
Space reservation is made using the srmReserveSpace
function. In case of successful reservation, a unique name, called space token is assigned to the reservation. A space token can be used during the transfer operations to tell the system to put the files being manipulated or transferred into an associated space reservation. A storage system ensures that the reserved amount of the disk space is indeed available, thus providing a guarantee that a client does not run out of space until all space promised by the reservation has been used. When files are deleted, the space is returned to the space reservation.
dCache only manages write space, i.e. space on disk can be reserved only for write operations. Once files are migrated to tape, and if no copy is required on disk, space used by these files is returned back into space reservation. When files are read back from tape and cached on disk, they are not counted as part of any space. SRM space reservation can be assigned a non-unique description that can be used to query the system for space reservations with a given description.
Properties of the SRM space reservations can be discovered using the SrmGetSpaceMetadata
function.
Space Reservations might be released with the function srmReleaseSpace
.
For a complete description of the available space management functions please see the SRM Version 2.2 Specification.
Data transfer functions
SURLs and TURLs
SRM
defines a protocol named SRM
, and introduces a way to address the files stored in the SRM
managed storage by site URL (SURL of the format srm://<host>:<port>/[<web service path>?SFN=]<path>
.
Example: Examples of the SURLs a.k.a. SRM URLs are:
srm://fapl110.fnal.gov:8443/srm/managerv2?SFN=//pnfs/fnal.gov/data/test/file1
srm://fapl110.fnal.gov:8443/srm/managerv1?SFN=/pnfs/fnal.gov/data/test/file2
srm://srm.cern.ch:8443/castor/cern.ch/cms/store/cmsfile23
A transfer URL (TURL) encodes the file transport protocol in the URL.
Example: gsiftp://gridftpdoor.fnal.gov:2811/data/test/file1
SRM
version 2.2 provides three functions for performing data transfers:
- srmPrepareToGet
- srmPrepareToPut
- srmCopy
(in SRM
version 1.1 these functions were called get
, put
and copy
).
All three functions accept lists of SURLs as parameters. All data transfer functions perform file/directory access verification and srmPrepareToPut
and srmCopy
check if the receiving storage element has sufficient space to store the files.
srmPrepareToGet
prepares files for read. These files are specified as a list of source SURLs, which are stored in an SRM managed storage element. srmPrepareToGet
is used to bring source files online and assigns transfer URLs (TURLs) that are used for actual data transfer.
srmPrepareToPut
prepares an SRM managed storage element to receive data into the list of destination SURLs. It prepares a list of TURLs where the client can write data into.
Both functions support transfer protocol negotiation. A client supplies a list of transfer protocols and the SRM server computes the TURL using the first protocol from the list that it supports. Function invocation on the Storage Element depends on implementation and may range from simple SURL to TURL translation to stage from tape to disk cache and dynamic selection of transfer host and transfer protocol depending on the protocol availability and current load on each of the transfer server load.
The function srmCopy
is used to copy files between SRM
managed storage elements. If both source and target are local to the SRM
, it performes a local copy. There are two modes of remote copies:
-
PULL mode : The target
SRM
initiates ansrmCopy
request. Upon the client\u0411\u2500\u2265ssrmCopy
request, the targetSRM
makes a space at the target storage, executessrmPrepareToGet
on the sourceSRM
. When the TURL is ready at the sourceSRM
, the targetSRM
transfers the file from the source TURL into the prepared target storage. After the file transfer completes,srmReleaseFiles
is issued to the sourceSRM
. -
PUSH mode : The source
SRM
initiates ansrmCopy
request. Upon the client\u0411\u2500\u2265ssrmCopy
request, the sourceSRM
prepares a file to be transferred out to the targetSRM
, executessrmPrepareToPut
on the targetSRM
. When the TURL is ready at the targetSRM
, the source SRM transfers the file from the prepared source into the prepared target TURL. After the file transfer completes,srmPutDone
is issued to the targetSRM
.
When a specified target space token is provided, the files will be located in the space associated with the space token.
SRM
Version 2.2 srmPrepareToPut
and srmCopy
PULL mode transfers allow the user to specify a space reservation token or a retention policy and access latency. Any of these parameters are optional, and it is up to the implementation to decide what to do, if these properties are not specified. The specification requires that if a space reservation is given, then the specified access latency or retention policy must match those of the space reservation.
The Data Transfer Functions are asynchronous, an initial SRM
call starts a request execution on the server side and returns a request status that contains a unique request token. The status of request is polled periodically by SRM
get request status functions. Once a request is completed and the client receives the TURLs the data transfers are initiated. When the transfers are completed the client notifies the SRM
server by executing srmReleaseFiles
in case of srmPrepareToGet
or srmPutDone
in case of srmPrepareToPut
. In case of srmCopy
, the system knows when the transfers are completed and resources can be released, so it requires no special function at the end.
Clients are free to cancel the requests at any time by execution of srmAbortFiles
or srmAbortRequest
.
Request status functions
The functions for checking the request status are:
- srmStatusOfReserveSpaceRequest
- srmStatusOfUpdateSpaceRequest
- srmStatusOfChangeSpaceForFilesRequest
- srmStatusOfChangeSpaceForFilesRequest
- srmStatusOfBringOnlineRequest
- srmStatusOfPutRequest
- srmStatusOfCopyRequest
Directory functions
SRM
Version 2.2, interface provides a complete set of directory management functions. These are
- srmLs , srmRm
- srmMkDir , srmRmDir
- srmMv
Permission functions
SRM Version 2.2 supports the following three file permission functions:
- srmGetPermission
- srmCheckPermission and
- srmSetPermission
dCache contains an implementation of these functions that allows setting and checking of Unix file permissions.