dCache 2.0 Release Notes

Upgrade Instructions

Incompatibilities

Please consider the following changes when upgrading from a version before 2.0.0:

Compatibility

It is safe to mix pools of releases 1.9.12-11, 1.9.13-4 or newer with 2.0. Head nodes and doors must be upgraded to 2.0 together and cannot be mixed with head nodes or doors of releases before 2.0. Components of different 2.0 releases can be mixed freely.

dCache 2.0.3

Service: nfs

Do not log client errors with error level.

Service: webdav

Avoid creation of temporary buffer files in /tmp. These were created during directory listing or when redirect on read was enabled. Instead the door will now use HTTP chunked encoding.

Service: gridftp, kerberosftp

Fix regression in directory listing which resulted in the DN or Kerberos principal rather than the mapped user name being used. The regression was introduced in 1.9.12-8.

Service: httpd

Fix match feature of Pool Selection Configuration page.

Service: xrootd, webdav

Allow embedded gPlazma to be used and make the name of the gPlazma cell to use configurable.

Service: pool

Fix classpath for pool reconstruct command.

Service: gplazma

The kpwd file format supports user name and password entries; such entries identify a user by checking whether the user-supplied password hashes to the stored password hash.
Support for updating the kpwd file was broken. A new user can now be added to the kpwd file with:
dcache kpwd dcuseradd <newuser> -u <12345> -g <1234> -h / -r / -f / -w read-write -p <password>

Init Scripts

The return code of dcache status now reflects the status of the service.

The dcache status command now lists orphaned domains for which the wrapper process has died. Previously such domains were reported as stopped and the dCache script would happily create a second instance. Now such instances can be killed by dcache stop.

dCache 2.0.2

Service: pool

Fix HTTP return code in case client reads beyond the end of the file (dCache used to return 400 and the correct return code is 416). Resolves an interoperability issue with ARC.
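
The rule behind this fix can be sketched in a few lines. This is a minimal illustration rather than dCache code; the function name range_status is hypothetical. It relies only on the HTTP rule that a byte range starting at or beyond the current length of the file cannot be satisfied.

```python
def range_status(first_byte, file_size):
    """Return the HTTP status for a read starting at offset first_byte.

    Hypothetical helper for illustration; not part of dCache.
    """
    if first_byte >= file_size:
        return 416  # Range Not Satisfiable: the read starts past the end of the file
    return 206      # Partial Content: at least one requested byte exists
```

A client that receives 416 knows it has reached the end of the file, whereas 400 suggests that the request itself was malformed.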

Fix 'dcache pool ls' to show the pool size stored in the pool's setup file, in case such a pool size exists.

Service: xrootd, webdav, ftp, gridftp, kerberosftp

Fix a harmless but noisy null pointer exception. The problem could affect FTP and WebDAV doors too, but was first observed in the xrootd door.

Service: webdav

Fix HTTP basic authentication. Basic authentication was broken due to a null pointer exception being thrown while generating the error page.

Changelog 2.0.1 to 2.0.2

dCache 2.0.1

Service: admin

Fix several bugs related to line editing. Add history search (press Ctrl-R).

An unfortunate side effect of these fixes is that line editing needs to know the terminal size. Some ssh clients do not transmit terminal resize messages. For those clients, resizing the terminal after logging in to the admin service will break line editing.

Fix listing of admin service in dcache services.

Service: webadmin

Fix listing of webadmin service in dcache services.

Fix race condition that caused ConcurrentModificationExceptions.

Service: poolmanager

Streamline the interface of partition plugins. Third party plugins will have to be updated (to our knowledge no such plugins exist yet).

Service: pool

Fix several race conditions that could lead to stale hidden movers, TCP connection leaks and link count inconsistencies.

Fix several HTTP compliance issues related to range requests (i.e. partial reads).

Service: gplazma

Fix nsswitch plugin compatibility issue on 64-bit Debian.

Service: pnfsmanager, nfsv3, nfsv41

Fix name space corruption when moving directories (chimera only).

Fix RFC 5661 compliance of ACL delete permission check. A file or directory can now be deleted if the subject has the DELETE_CHILD permission on the parent directory, or the DELETE permission on the entry being deleted, or, if neither of these is specified, the ADD_FILE permission on the parent directory.
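
The delete rule can be written as a small decision function. The sketch below is an illustration only: the three-valued Perm type and the name may_delete are assumptions made for the example, and it reads "specified" as "the ACL contains an explicit allow or deny for that permission".

```python
from enum import Enum


class Perm(Enum):
    """Outcome of evaluating one ACL permission (assumed model)."""
    ALLOW = "allow"
    DENY = "deny"
    UNSPECIFIED = "unspecified"


def may_delete(delete_child_on_parent, delete_on_entry, add_file_on_parent):
    """Apply the delete rule described in the release note."""
    # Either permission being explicitly allowed is sufficient.
    if Perm.ALLOW in (delete_child_on_parent, delete_on_entry):
        return True
    # Fall back to ADD_FILE only when neither permission is specified.
    if (delete_child_on_parent is Perm.UNSPECIFIED
            and delete_on_entry is Perm.UNSPECIFIED):
        return add_file_on_parent is Perm.ALLOW
    return False
```

With this reading, an explicit denial of DELETE on the entry is not overridden by ADD_FILE on the parent, but it is overridden by an allowed DELETE_CHILD.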

Service: pnfsmanager

Fix instantiation with PNFS backend.

Service: srm

Improve robustness against ill-formed srmLs requests.

Service: ftp, gridftp, kerberosftp

Fix race condition that could occur during shutdown.

Make timeout of write mover creation configurable. The timeout is shared with read mover creation. The timeout can be configured by adjusting the gsiftpPoolTimeout property. A side effect is that the default timeout for write mover creation increases from 10 seconds to 10 minutes.

Service: hopping

Add service definition for hopping manager. The name of the new service is hopping.

Service: info

Fix logging of several erroneous warning and error messages.

Info provider

Fix GLUE2 compliance. Make sure that at least v2.0.8 of the glue-schema RPM is installed on the node running the info provider.

Miscellaneous

Improve documentation in configuration defaults.

Fix resolution of well-known cell names when using JMS messaging.

Changelog 2.0.0 to 2.0.1

dCache 2.0.0

Info provider

This patch moves some of the responsibility of configuring the info-provider from the /opt/d-cache/etc/info-provider.xml file to the /opt/d-cache/etc/dcache.conf file. Therefore, you will need to edit dcache.conf so that it contains your configuration. See the defaults file /opt/d-cache/share/defaults/info-provider.properties for the list of affected properties.

Note that if you are using the default value for an info-provider property, you do not need to configure that property in dcache.conf: the default value will be used automatically.

Service: nfsv4

Introduced new options to tune the idmap cache:

# maximal number of entries in the cache
nfs.idmap.cache.size = 512

# cache entry maximal lifetime
nfs.idmap.cache.timeout = 30

# time unit used for timeout. Valid values are:
# SECONDS, MINUTES, HOURS and DAYS
nfs.idmap.cache.timeout.unit = SECONDS
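
To illustrate how a size limit and an entry lifetime interact, here is a minimal sketch of a bounded cache with expiry. It is not the dCache implementation: the class name BoundedTtlCache is hypothetical, and evicting the oldest inserted entry when the cache is full is an assumption made for the example.

```python
import time
from collections import OrderedDict


class BoundedTtlCache:
    """Cache with a maximum size and a per-entry lifetime (illustrative)."""

    def __init__(self, max_size=512, timeout_seconds=30, clock=time.monotonic):
        self._max_size = max_size
        self._timeout = timeout_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (value, expiry time)

    def put(self, key, value):
        # Re-inserting a key refreshes both its position and its lifetime.
        self._entries.pop(key, None)
        self._entries[key] = (value, self._clock() + self._timeout)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # assumed policy: drop the oldest entry

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries are dropped on access
            return None
        return value
```

A larger size avoids repeated lookups for many distinct users, while a shorter timeout makes changes in the underlying mapping visible sooner.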

GSS support?

Service: webdav

The WebDAV door now supports HTTP basic authentication, aka username and password. The authentication scheme relies on gPlazma 2 to validate passwords. A password enabled gPlazma auth plugin must be used.

Note that HTTP basic authentication is not secure when used over unencrypted HTTP. We recommend combining it with HTTPS.

The following new configuration property was added:

# ---- Whether HTTP Basic authentication is enabled
#
# When enabled a user name and password will be requested on
# authorization failures.
#
# Note that HTTP Basic authentication essentially transfers
# passwords in clear text. A secure setup should only use HTTP Basic
# authentication over HTTPS.
#
webdavBasicAuthentication=false
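
The reason for the warning is that HTTP Basic authentication only base64-encodes the credentials, and that encoding is reversible without any key. The sketch below demonstrates this with the Python standard library; the helper names are hypothetical and the credentials are made up.

```python
import base64


def basic_auth_header(user, password):
    """Build the value of an Authorization header for HTTP Basic."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header):
    """Recover user name and password from a Basic Authorization value."""
    _scheme, token = header.split(" ", 1)
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password
```

Anyone who can observe the request on an unencrypted connection can apply the second function, which is why HTTPS is recommended.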

An example gPlazma 2 configuration with support for password based authentication is:

auth    optional  kpwd
map     requisite kpwd
session requisite kpwd

This requires that the dcache.kpwd file contains password entries. Use the dcache kpwd command set to add such entries. Other password enabled gPlazma 2 plugins can of course be used too.

Service: gplazma2

Added nsswitch and krb5 plugins.

TODO: document these

Service: pnfsmanager, acl

Merged the acl service into pnfsmanager service. The acl service is now deprecated. The ACL administrative commands are now available through the PnfsManager cell.

Note that ACLs are only supported with Chimera.

Service: ftp, gridftp, kerberosftp, dcap, gsidcap, xrootd, webdav

Reduced the latency of registering a newly started door with LoginBroker.

Service: poolmanager

PoolManager.conf

Several commands have been obsoleted and are no longer supported. See the list below for the obsolete commands and available replacements.

Important: Before upgrading to 2.0, PoolManager.conf MUST be regenerated by using the save command using either 1.9.12-9, 1.9.13-3 or newer. Version 2.0 will fail to start if PoolManager.conf contains any of the obsolete commands. If third party scripts are used to generate PoolManager.conf, then these scripts will likely have to be updated.

Obsolete partition manager commands
rc set stage
Replaced by
        pm set -stage-oncost
        pm set -stage-allowed
rc set sameHostCopy
Replaced by
        pm set -sameHostCopy
rc set max copies
Replaced by
        pm set -max-copies
set pool decision
Replaced by
        pm set -spacecostfactor
        pm set -cpucostfactor
rc set p2p
Replaced by
        pm set -p2p-allowed
        pm set -p2p-oncost
        pm set -p2p-fortransfer
set costcuts
Replaced by
        pm set -p2p
        pm set -alert
        pm set -halt
        pm set -fallback
        pm set -idle
rc set slope
Replaced by
        pm set -slope
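
Sites that generate PoolManager.conf with their own scripts may want to check the file for obsolete commands before upgrading. The following sketch is one possible check, not a tool shipped with dCache: the function name is hypothetical, and matching on the command prefixes listed above is an assumption about how these commands appear in the file.

```python
# Obsolete command prefixes, taken from the list above.
OBSOLETE_PREFIXES = (
    "rc set stage",
    "rc set sameHostCopy",
    "rc set max copies",
    "set pool decision",
    "rc set p2p",
    "set costcuts",
    "rc set slope",
)


def find_obsolete_commands(config_text):
    """Return (line number, line) pairs for lines using an obsolete command."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        normalized = " ".join(line.split())  # collapse runs of whitespace
        for prefix in OBSOLETE_PREFIXES:
            if normalized == prefix or normalized.startswith(prefix + " "):
                hits.append((number, normalized))
                break
    return hits
```

An empty result does not replace regenerating the file with the save command; it only helps to locate lines that third party scripts still emit.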

Wildcards

Several psu commands now accept wildcards (glob patterns). Use help psu to see which commands accept wildcards.
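
For readers unfamiliar with glob patterns, the sketch below shows the general idea using Python's fnmatch module. The pool names are made up, and the exact pattern syntax accepted by the psu commands may differ from what fnmatch supports.

```python
from fnmatch import fnmatchcase


def match_names(pattern, names):
    """Return the names that match a glob pattern, preserving order."""
    return [name for name in names if fnmatchcase(name, pattern)]
```

For example, match_names("pool_a*", ["pool_a1", "pool_a2", "pool_b1"]) selects the first two names, because * matches any sequence of characters.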

Partition manager

The pool selection process consists of two parts: The first part uses an admin configured rule engine. This is called the pool selection unit and controls to and from which pools particular files may be written, read or staged. Once a set of candidate pools has been determined, the second step chooses one of those based on criteria such as free space and load.

As of version 2.0, this second step is pluggable. This means that the admin may choose among several selection algorithms, and that third party developers may write custom plugins to further tweak and tune dCache.

To support such pluggable selection algorithms, the partition manager was rewritten. Partition manager allows several sets of parameters (partitions) to be defined and associated with different links. In 2.0 this mechanism has been extended such that the pool selection logic itself is part of a partition. The partition is pluggable such that different links may use different pool selection strategies.

As part of the rewrite several partition manager related commands have changed. In particular pm create now takes a plugin type parameter and the output format of pm ls has changed.

As before, a partition named default is hardcoded into the system. It is used by all links that do not explicitly define the partition to use.

Each partition has its own set of parameters and different types of partitions may support different sets of parameters. All partitions inherit from a global set of parameters. This global set is at the moment identical to the parameters of the default partition, but this may change in the future. That is, the commands pm set -option=value and pm set default -option=value are currently identical, but in a future version the former may be changed to update the global parameter set while the latter only updates the default partition.

The following partition types are supported:

classic
The pool selection algorithm used in previous versions of dCache.
random
Selects a pool randomly.
lru
Selects the pool that has not been used the longest.
wass
An experimental selection algorithm that selects pools randomly, weighted by available space, while incorporating the age and amount of garbage collectible files and information about load. It will be refined in future versions and may one day become the default.
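
The difference between these strategies can be sketched as follows. The code is a simplified illustration, not the dCache plugins: all function names are hypothetical, treating never-used pools as least recently used is an assumption, and the last function weights by free space only, so it is not the wass algorithm, which also considers removable files and load.

```python
import random


def select_random(pools, rng=random):
    """Pick any candidate pool with equal probability."""
    return rng.choice(pools)


def select_lru(pools, last_used):
    """Pick the pool whose last use lies furthest in the past.

    last_used maps pool name to a timestamp; pools without an entry
    are treated as never used and therefore preferred (assumption).
    """
    return min(pools, key=lambda pool: last_used.get(pool, float("-inf")))


def select_weighted_by_free_space(pools, free_space, rng=random):
    """Pick a pool at random, with probability proportional to free space.

    Raises ValueError if no candidate pool has any free space.
    """
    weights = [free_space[pool] for pool in pools]
    return rng.choices(pools, weights=weights, k=1)[0]
```

Weighting by free space tends to fill pools evenly over time, whereas a purely random choice ignores how full each pool already is.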

Third party plugins may be added to dCache. We will soon publish developer information about how to write such plugins.

Service: pool

The memory overhead of pool meta data was reduced.

The HTTP mover now silently ignores requests for favicon.ico. The change reduces the clutter in the pool log files.

Proportional pool selection in migration module has been updated to use the WASS algorithm. There are no changes to any options and the changes in observable behaviour should be minimal.

Pool to pool transfers used to use DCAP to transfer the file. Starting with version 2.0, all pool to pool transfers use HTTP. This change should be transparent, and a WebDAV door is not required. The primary observable change is that the TCP connection between source and destination pool is now created from the destination pool to the source pool; in previous versions the direction was reversed. Since HTTP is classified as a WAN protocol, the port on which the source pool listens for the TCP connection is allocated from the WAN port range. The pool CLI commands pp set port and pp set listen are deprecated and replaced by the command pp interface. The pp set listen command, however, calls through to the pp interface command, meaning that old pool configurations will work as is. Important: Due to this change, dCache 2.0 pools are only compatible with pools of releases 1.9.12-11, 1.9.13-4 and newer.

Service: srm

Overwrite flag

The SRM protocol allows the client to request that existing files are overwritten upon upload. By default dCache does not honor this option. Instead it always refuses to overwrite a file.

The configuration property overwriteEnabled allowed this behaviour to be changed. When set to true, the SRM would respect the overwrite request of the client. The way this was implemented, however, meant that the option had to be set in all doors that the SRM could redirect to. Thus one was faced with the choice of either not honoring the client's request or enabling overwrite by default for all other protocols too.

In 2.0 the handling of the overwrite flag in the SRM has changed. When overwriteEnabled is set to true and the client requests to overwrite a file, the SRM will delete the file before redirecting to another door. This means that an SRM door can now honor the client's request even when all other doors are configured not to overwrite existing files.
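
The new behaviour amounts to the following decision, sketched here with a set of paths standing in for the name space. The function name and the use of FileExistsError are assumptions made for the example; this is not the SRM implementation.

```python
def prepare_upload(path, client_requests_overwrite, overwrite_enabled, namespace):
    """Decide whether an upload to path may proceed.

    namespace is a set of existing paths (a stand-in for the real name space).
    """
    if path in namespace:
        if not (overwrite_enabled and client_requests_overwrite):
            # Both the site setting and the client request are required.
            raise FileExistsError(path)
        # Delete the old file first, so the door that receives the
        # redirected transfer sees a path that does not exist.
        namespace.discard(path)
    return path
```

Because the existing file is removed before the redirect, the target door never has to overwrite anything and can keep its own overwrite setting disabled.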

Internal delegation of credentials

srmCopy with GridFTP used to use GSI delegation to transfer credentials from the SRM door to the pool. This was a slow and CPU intensive process. The SRM door and pool have now been updated to transfer the credentials through the dCache internal message passing mechanism.

We assume that the message passing mechanism is secure. Care should be taken to properly firewall access to the message passing system. This should be done whether srmCopy is used or not.

Service: pinmanager, pnfsmanager, spacemanager

Switched to BoneCP for database connection pooling. The default number of database connections has increased. We may reduce this in future releases, but for now you should expect that the connection limit for PostgreSQL has to be increased.

Service: xrootd

Added a command to the door to kill a mover: kill mover. Note that transfers without a mover cannot be killed. For this reason we may decide to replace the command with a more powerful version in subsequent releases.

The service provider interface (SPI) of authorization and path mapping plugins has changed. To our knowledge no third party plugins exist at the moment. The updated SPI makes it easier to develop and deploy third party plugins. Developer information about the SPI will be published in the near future.

Service: topo

Added the get hostname command to the System cell. This command is used by the topo service to collect the host names of all dCache nodes. The collection can be queried through the topo cell using the updatehostnames and getallhostnames commands.

Changelog 1.9.13-1 to 2.0.0

Greyed out entries have been merged into the 1.9.13 branch.