Executive summary

  • Faster deletion speeds

  • Better integration of alarms service, easier configuration, and predefined alarms

  • Multiple GIDs in kpwd

  • Less clutter in GLUE 2 StorageShare

  • Improved pool response during HSM request bursts

  • Reduced latency on write

  • Automatic throttling of SRM request rate when database request queue is full

  • Decoupling of transfer managers from srm database

Incompatibilities

This section describes potential incompatibilities with previous versions of dCache.

Configuration properties

All properties marked forbidden or obsolete are now no longer recognised by dCache. All properties that were deprecated in 2.10 are now marked forbidden.

Release 2.11.51

Changes affecting multiple services

Fixed an issue with the dcache dump heap command when called with a simple file name as the output path. In this case the dump could be written to a different directory while the script claimed the dump had failed. The dcache dump heap command also has a --force option for cases in which the JVM is unresponsive. This option was ignored for processes not running as root; this is now fixed.

cells

A bouncing message bug in System cell is fixed.

pool

Fix race condition in request scheduler.

Changelog 2.11.50..2.11.51

010e44f
[maven-release-plugin] prepare release 2.11.51
a6949c5
dcache: fix heap dump to simple file names
8080ee3
script: Make dump heap --force work for non-root processes
c5a1f6d
billing: additional fixes to insert triggers
9dcb113
srm: Do not expose TURL before request is ready
0038371
pool: Fix race condition in request scheduler
44c2af1
system-test: update disposable-CA generated credentials
c4f72e7
cells: Avoid bouncing message on no-route errors in System cell
4e11bbe
[maven-release-plugin] prepare for next development iteration

Release 2.11.50

many

When representing checksums in the admin interface and configuration files, checksums are now presented in an improved format.

Changelog 2.11.49..2.11.50

919e8dc
[maven-release-plugin] prepare release 2.11.50
a18e783
common: fix ChecksumType.toString()
d18016b
[maven-release-plugin] prepare for next development iteration

Release 2.11.49

cells

LoginManager would occasionally generate error messages similar to “Discarding listening on $LOCATION 53684’ because its age of 18721640 ms exceeds its time to live of 4500 ms.”. This was due to erroneous reuse of old message envelopes in the internal messaging. This change fixes that problem.

This change addresses a potential problem in which messages sent between cells in the same domain could appear older than they are and thus would risk being discarded due to the time-to-live being expired.
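
The effect of a reused envelope on the time-to-live check can be illustrated with a minimal sketch. This is not dCache's messaging code: the Envelope class and its fields are hypothetical, and the numbers are taken from the example log message quoted above.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical message envelope with a creation time and a time to live."""
    created_ms: int
    ttl_ms: int

    def age_ms(self, now_ms: int) -> int:
        return now_ms - self.created_ms

    def is_expired(self, now_ms: int) -> bool:
        # A message is discarded once its age exceeds its time to live.
        return self.age_ms(now_ms) > self.ttl_ms


now_ms = 18_721_640

# Reusing an old envelope keeps the old creation time, so a message that
# was only just sent appears to be hours old and is discarded.
reused = Envelope(created_ms=0, ttl_ms=4_500)
print(reused.age_ms(now_ms), reused.is_expired(now_ms))

# A fresh envelope per message gives the age the receiver expects.
fresh = Envelope(created_ms=now_ms, ttl_ms=4_500)
print(fresh.age_ms(now_ms), fresh.is_expired(now_ms))
```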

srm

A Tier-1 site reported problems with a major WLCG VO’s read requests. Investigating the source of the problems showed that the srm_ifce library, used by the (outdated) GFAL v1 and the (supported) GFAL v2 SRM libraries, drastically limits the permitted lifetime of requests without providing admins any way to configure this.

For sites seeing errors related to desiredTotalRequestTime being exceeded, this change provides the new configuration option srm.request.maximum-client-assumed-bandwidth in srm.properties as a work-around.

Sites not observing such errors do not need to change anything with regard to this value.

Changelog 2.11.48..2.11.49

f526620
[maven-release-plugin] prepare release 2.11.49
1bdc3e9
srm: add short request lifetime work-around
5577ba2
chimera: Fix regression in inheriting ACLs on directory creation (HSQLDB)
0992e32
cells: Improve robustness of message time to live
b46ca1a
cells: Fix erroneous reuse of message envelope in location manager registration
60bef1d
[maven-release-plugin] prepare for next development iteration

Release 2.11.48

pnfsmanager

Billing entries for SRM uploads recently lost the storage class part of the entry. This update fixes that issue.

poolmanager

This modification fixes a potential race condition in pool manager.

webdav

This modification corrects the error reporting under WebDAV. When attempting to delete a non-existing file, unauthenticated users receive a 401 Unauthorized response, while authenticated users receive a 404 Not Found response.

Changelog 2.11.47..2.11.48

c7fa199
[maven-release-plugin] prepare release 2.11.48
489bbcb
pnfsmanager: Fix regression in SRM billing entries
db0f3f1
poolmanager: Fix race condition in pool selection unit
ba9df79
webdav: fix 404 error if attempting to delete a nonexistent file
ad5442d
[maven-release-plugin] prepare for next development iteration

Release 2.11.47

pnfsmanager

With this release, PnfsManager adds safety checks rejecting invalid upload paths that SRM might erroneously supply. This release hardens an installation against possible bugs triggering data loss.

This release adds a check that detects failed or incomplete SRM uploads and prevents the file from being committed to its final path. A common symptom of this bug was zero-sized files that experiment catalogues registered as successfully uploaded.

srm

This release adds a check to detect broken uploads using SRM during the final stage of file transmission. While it causes transfers to take a little more time, resilience against upload failures is increased.

Changelog 2.11.46..2.11.47

56d90b3
[maven-release-plugin] prepare release 2.11.47
271fa08
srm: Check for broken files during srmPutDone
33713a7
pnfsmanager: Check file size and upload completion when committing temporary upload paths
11931a0
pnfsmanager: Protect against erroneous upload paths
d9e2e57
[maven-release-plugin] prepare for next development iteration

Release 2.11.46

Changes affecting multiple services

Several cases of slow performance were reported when deleting directories in Chimera. This is now fixed.

pool

When a command to migrate files between pools (e.g. migration concurrency or migration copy) failed because the migration job could not be found, the failure was reported as a bug. This is now fixed: the pool returns a message indicating that the requested job does not exist.

srm

When a file upload is cancelled, the temporary upload path tracked by SRM could differ from a regular upload path, either because it was changed outside of dCache or because it contains entries from a very old version of dCache. Cancelling such an upload could result in data loss. This release fixes this potential data loss scenario.

statistics

The statistics service creates static HTML pages that describe dCache usage over time; these are simple files that the webadmin service can serve. The pages include information about pools and store-units. Previously, the statistics web pages did not show information about any pool or store-unit with a / in its name. This is now fixed. A side-effect is that the history of any pool or store-unit containing a ^ in the name is lost.

Changelog 2.11.45..2.11.46

3a6fb84
[maven-release-plugin] prepare release 2.11.46
311c77b
chimera: Alter statistics target for t_tags(itagid)
e180fa7
srm: Add safe-guard against invalid file ID in put requests
fecb34d
pool: Don’t consider failure to find migration job a bug
b1cde64
statistics: encode ‘/’ in filenames
ecdf23d
[maven-release-plugin] prepare for next development iteration

Release 2.11.45

nfs

Pinning files is now a non-blocking operation. For files stored on tape, this should result in a more responsive system behaviour, avoiding NFS blocking in situations with many concurrent pin requests.

Changelog 2.11.44..2.11.45

9474044
[maven-release-plugin] prepare release 2.11.45
8074557
adapt 3eab402754b814f681a14d296d808031b05f2737 to 2.11 branch code
9115ff4
nfs: use noitify instead of blocking sendAndWait when sending pin/unpin messages via touch “.(get)(<file_name>)(pin)” command
1970035
[maven-release-plugin] prepare for next development iteration

Release 2.11.44

Changes affecting multiple services

Sometimes, when a cell’s start-up was interrupted, the resulting error message was logged as a bug. This is now fixed.

info-provider

The GLUE information provider supplies information about the dCache instance, which is important for clients within WLCG. Because different doors in dCache can have different roots, clients may need to adjust their path when accessing dCache through different doors. The info-provider is updated so that a new path root property is provided. This allows clients to modify paths as necessary. Note that the SRM door already supports this translation when redirecting clients for transfers.

Changelog 2.11.43..2.11.44

9476747
[maven-release-plugin] prepare release 2.11.44
6746bd6
info-provider: publish door root path
20e2d4d
cells: Suppress illegal state exception during initialization
a5c6109
[maven-release-plugin] prepare for next development iteration

Release 2.11.43

pool

Revert Netty version back to v3.9.9. A previous release upgraded the version of Netty, but this appears to have introduced problems for some HTTP transfers where the pools run out of memory.

spacemanager

Spacemanager backs off when it encounters a problem writing to the database. Previously, if the problem was due to a deadlock then the two tasks involved were delayed by the same amount, which made it possible for a subsequent attempt to deadlock as well. This release randomises the delay to reduce the likelihood of this problem occurring.
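
A randomised back-off of this kind can be sketched as follows. This is a minimal illustration rather than spacemanager's implementation: the function name, the exponential growth, and the base and cap values are all assumptions; the release note only states that the delay is randomised.

```python
import random
from typing import Optional


def backoff_delay(attempt: int,
                  base_s: float = 0.5,
                  cap_s: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return a randomised delay in seconds before retry number `attempt`.

    base_s, cap_s and the exponential growth are illustrative assumptions,
    not values taken from spacemanager.
    """
    rng = rng if rng is not None else random.Random()
    ceiling = min(cap_s, base_s * (2 ** attempt))
    # Drawing uniformly from [0, ceiling] makes it unlikely that two tasks
    # that deadlocked with each other retry at the same instant.
    return rng.uniform(0.0, ceiling)


# Two tasks that failed together use independent random sources,
# so they normally wake up at different times.
task_a = random.Random(1)
task_b = random.Random(2)
for attempt in range(3):
    print(attempt,
          backoff_delay(attempt, rng=task_a),
          backoff_delay(attempt, rng=task_b))
```

With a fixed delay, both tasks would retry simultaneously and could collide again; with independent random delays, one task usually completes before the other retries.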

Changelog 2.11.42..2.11.43

ccf844b
[maven-release-plugin] prepare release 2.11.43
3e6242d
pool: revert Netty back to v3.9.9
11b4e4b
spacemanager: Randomize backoff in case of transient errors
74e6bf0
[maven-release-plugin] prepare for next development iteration

Release 2.11.42

xrootd

Fix dCache handling of open requests where uploads were considered downloads.

Changelog 2.11.41..2.11.42

5529bae
[maven-release-plugin] prepare release 2.11.42
abc99ab
xrootd: Fix classification of uploads
51f0784
[maven-release-plugin] prepare for next development iteration

Release 2.11.41

Changes affecting multiple services

Don’t log Error while reading from tunnel: java.nio.channels.AsynchronousCloseException when a domain shuts down.

Changelog 2.11.40..2.11.41

17822a3
[maven-release-plugin] prepare release 2.11.41
f38624f
cells: Don’t log AsynchronousCloseException when tunnel closes
09ad53e
[maven-release-plugin] prepare for next development iteration

Release 2.11.40

Changes affecting multiple services

Update the Spring, Milton, AspectJ, Jetty and DataNucleus-core libraries to their latest versions. All dCache services are affected.

pool

If a 3rd-party transfer fails then the pool may log and report incomplete information on why this happened. This release fixes this problem.

Changelog 2.11.39..2.11.40

94877da
[maven-release-plugin] prepare release 2.11.40
9a50720
2.11: upgrade third party dependencies
e93efef
http-3rd-party: ensure IOException logged with toString
6267fa8
info: fix test to be less critical on timing
d60977f
[maven-release-plugin] prepare for next development iteration

Release 2.11.39

nfs

Report EOF when client undertakes mixed read/write workload and attempts to read beyond currently written data.

poolmanager

Previous releases of dCache contained a bug where replicas generated by pool-to-pool copies failed to include the access latency and retention policy. This release fixes the bug. While it did not directly affect dCache operations, the result is that this information is not reliable for replicas created by the affected releases.

spacemanager

Fix listing by PNFS-ID. Glob support is removed as it was non-functional.

Changelog 2.11.38..2.11.39

0f7347a
[maven-release-plugin] prepare release 2.11.39
e4aa4a2
poolmanager: Fix missing access latency and retention policy on pool to pool copy
8ddd796
spacemanager: Fix listing by pnfs id
4611bd1
nfs-proxy: keep track of written bytes
16445a0
[maven-release-plugin] prepare for next development iteration

Release 2.11.38

Changes affecting multiple services

This release fixes a caching issue where changes to inode metadata (e.g., ownership or permissions) for / (the root directory of Chimera) are not visible until the service is restarted. This affects NFS doors and pnfsmanager service.

Changelog 2.11.37..2.11.38

310eb5d
[maven-release-plugin] prepare release 2.11.38
447822a
chimera: Prevent filling of stat cache of root inode
c30f84c
[maven-release-plugin] prepare for next development iteration

Release 2.11.37

Changes affecting multiple services

The chimera library, used by PnfsManager and NFS, contained possible race conditions that could lead to a NullPointerException. The library is updated so that Chimera gives the correct error message under these circumstances.

The chimera library, used by the pnfsmanager and nfs services, contained a bug where two near-simultaneous attempts to delete a hitherto empty directory and to write a file into the same directory would both succeed but leave an orphaned file: it exists in the t_dirs table but the parent does not exist in the t_inodes table. This seems to be triggered when an ftp door fails with no write pool configured. This release fixes this problem.

pnfsmanager

Fix the error message (logged by the domain hosting pnfsmanager) if an attempt to finalise an SRM upload fails within pnfsmanager, or if an attempt to cancel an SRM upload fails within pnfsmanager.

webdav

Update to the latest version of Milton.

xrootd

Update the alice-token plugin to allow the host name check to succeed on dual-stack (IPv4 and IPv6) machines.

Changelog 2.11.36..2.11.37

68608ee
[maven-release-plugin] prepare release 2.11.37
77339bc
xrootd: Update alice token plugin to fix IPv6 compatibility
435f26c
chimera: Detect races in directory deletion
d2fdea0
chimera: Detect races during move
cf7c8c5
webdav: update to latest milton
ef188e5
PnfsManager: remove copy-n-paste error in error message
9c4315d
system-test: Fix grid-security settings
7b154e7
rpm: enforce SL5 compatibility when building RPM packages
5287e2b
[maven-release-plugin] prepare for next development iteration

Release 2.11.36

Changes affecting multiple services

Eliminate race condition that can lead to a NullPointerException if a cell does not shut down cleanly.

The dcache script no longer checks whether the hostkey.pem file is in PKCS#8 format when dcache stop is invoked. Previously this could lead to orphaned dCache domains; for example, when upgrading the dCache RPM.

admin

Fix potential IndexOutOfBoundsException should the response from the acm cell be malformed.

pnfsmanager

Contribution from Kurchatov: update the ChimeraCleaner to work-around the Java compiler’s inability to produce compatible Java 7 binaries when compiling with JDK 8. Note: sites using official dCache.org packages do NOT need to upgrade as a result of this change.

pool

Fix an intended pool-to-pool transfer optimisation: the receiving pool failed to reuse a delayed mover should the pool-to-pool request time out and be retried.

Fix pools so that they do not log NullPointerException if a pool receives a request to restore a file from an HSM to which it has no access.

Scripts

Fix how the writedata command in the chimera shell accepts data. Previously, the command-line argument was ignored if supplied and data was taken from stdin; if no argument was supplied, the command would fail with a NullPointerException. Note: this command does NOT write data into dCache, but into Chimera.

srm

Fix the context information included when logging failures to write job information to the database.

Changelog 2.11.35..2.11.36

f00f303
[maven-release-plugin] prepare release 2.11.36
9e1d0e2
srm-client, dcache: fixed passing incompatible arguments to functions
57e8e75
dcache: removed unecessary use of non-short-circuit logic
f311296
pool: Fix NPE when restoring file
8cf34a9
ChimeraCleaner: reallow to be compiled/run on different Java versions
ff3c1c9
scripts: do not check for PKCS#8 formatted hostkey.pem on shutdown
d56bf7a
chimera: Null value passed to non-null parameter in org.dcache.chimera.cli.Shell$WriteCommand.call()
cbe3e69
srm: Use correct logging context when saving jobs
23c82bb
cells: Fix NPE during shutdown
cfd0e9a
[maven-release-plugin] prepare for next development iteration

Release 2.11.35

Changes affecting multiple services

In previous versions, all doors (dcap, ftp, nfs, srm, webdav, xrootd) and pools advertised their presence to other dCache components during start-up, before they were able to handle incoming requests. This could lead to subsequent queries timing out while the service finished starting up. With this version of dCache, doors and pools only advertise their presence once they can handle incoming requests.

pool

This release updates how dCache configures the Berkeley DB when used for storing pool metadata. In addition, dCache will no longer disable the pool when suffering a Berkeley DB-related problem if the Berkeley DB environment is still valid. Combined, these two changes should greatly reduce the occurrences of pools disabling themselves when under heavy IO load.

Changelog 2.11.34..2.11.35

a10d104
[maven-release-plugin] prepare release 2.11.35
3b3ba40
pool: Refine Berkeley DB failure handling
1ed3b48
Don’t announce cells to other services until they have started
aa64225
[maven-release-plugin] prepare for next development iteration

Release 2.11.34

Changes affecting multiple services

The ftp, webdav and xrootd doors will delete the target file if an upload was unsuccessful. The copy manager (part of the transfermanagers service) has a similar behaviour if an internal copy is unsuccessful. If this delete was unsuccessful (e.g., the client deleted the file itself) previous dCache versions would log this at ERROR level. With this dCache version, such occurrences are logged at DEBUG level.

nfs

Fix race condition that can occur when a pool is first accepting pNFS transfers if multiple requests are processed almost simultaneously.

webdav

The webdav door has separate configuration allowing the admin to configure the door-local path that contains site-local files and the URI prefix to access those files. Earlier versions of dCache mistakenly used the former for the latter, which this release fixes.

Changelog 2.11.33..2.11.34

96732e6
[maven-release-plugin] prepare release 2.11.34
7472ae6
Revert “webdav: Add robots.txt”
ded846d
webdav: Respect webdav.static-content.uri property
f083d2f
doors: Do not log failure to delete absent files on upload failures:
8349731
nfs4: fix race in request processing
c61fb66
webdav: Add robots.txt
46c3745
[maven-release-plugin] prepare for next development iteration

Release 2.11.33

Changes affecting multiple services

In earlier versions of dCache, the code-base would always establish a node’s FQDN through a DNS query on start-up. For some services and for dCache scripts (for example, the chimera script), this information is not used. With this release, dCache only makes the DNS query when it is necessary, so domains hosting services that do not need this information and scripts will start faster.

dcap

Doors describe their root path to SRM so it can calculate appropriate TURLs. Previous versions of dCache had dcap doors register incorrect paths, which this release fixes.

dCache configuration allows an admin to control if certain ciphers are allowed. In particular, this allows sites to remove support for problematic ciphers or hashing algorithms. This release fixes a problem where the GSI-dcap door failed to honour such settings.

pool

The replica-manager periodically requests a list of file replicas that a pool is hosting. In previous versions of dCache, if the pool found a broken file then an error was returned to replica-manager, which then considered the entire pool as being offline. With this version of dCache, such errors are logged on the pool. The replica-manager will not consider the pool as hosting that file’s data, but will otherwise consider the pool online.

The different HSM operations (flush, stage and remove) have internal timeouts after which the pool considers the request as failed. In previous versions of dCache, the default pool setup included a four hour timeout for flush and stage but neglected to set a default for remove. This omission caused remove operations to time out very quickly. With this release, remove operations also have a default of four hours.

Fix the Ruby implementation of the hsmcp script (hsmcp.rb) so it can parse new command-line arguments that include concurrency options.

The concurrency for active HSM operations is configurable and may be adjusted dynamically. In earlier versions of dCache, decreasing the concurrency only became effective when that operation started to idle. This has been fixed so the limits start to have an effect as operations complete.

Each mover can have one of three priorities (LOW, MEDIUM, HIGH) and a selection discipline (FIFO or LIFO). The documented behaviour was for queues whose names start with - to have LIFO discipline and for those that start with any other character to have FIFO discipline. Due to a bug, the order was wrong: the priorities were inverted and the two disciplines swapped, so LOW priority movers were started before MEDIUM and MEDIUM before HIGH. This release fixes the priorities, so HIGH priority movers are selected in preference to MEDIUM and MEDIUM priority movers in preference to LOW; however, it was decided to keep the disciplines as in previous versions and to update the documentation accordingly. There are several reasons for this: first, there is no difference between LIFO and FIFO when movers are not queued; second, neither discipline will help if the pool is persistently overloaded; third, LIFO discipline (although unfair) is documented as providing a better overall throughput during a time-limited overload; fourth, by default dCache has been running with LIFO discipline since v1.9.11 (released 2011-01-13) without any apparent problems.
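
The resulting selection order can be sketched as follows. This is an illustration only, not the pool's scheduler: the class and method names are hypothetical, and the mapping of queue names to disciplines reflects one reading of the paragraph above (the swapped behaviour is kept, so a leading - selects FIFO and any other name selects LIFO).

```python
from enum import IntEnum
from typing import List, Optional, Tuple


class Priority(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


class MoverQueue:
    """Hypothetical queue: highest priority first, then FIFO or LIFO."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Assumed mapping, following the behaviour kept by this release:
        # a leading '-' selects FIFO, any other name selects LIFO.
        self.fifo = name.startswith("-")
        self._seq = 0
        self._waiting: List[Tuple[Priority, int, str]] = []

    def add(self, mover_id: str, priority: Priority) -> None:
        self._waiting.append((priority, self._seq, mover_id))
        self._seq += 1

    def next(self) -> Optional[str]:
        if not self._waiting:
            return None
        top = max(entry[0] for entry in self._waiting)
        candidates = [entry for entry in self._waiting if entry[0] == top]
        # Among movers of equal priority, FIFO takes the oldest entry
        # and LIFO takes the newest.
        pick = min if self.fifo else max
        chosen = pick(candidates, key=lambda entry: entry[1])
        self._waiting.remove(chosen)
        return chosen[2]


def drain(queue: MoverQueue) -> List[str]:
    order = []
    while (mover := queue.next()) is not None:
        order.append(mover)
    return order


for name in ("regular", "-ordered"):
    q = MoverQueue(name)
    q.add("a", Priority.LOW)
    q.add("b", Priority.HIGH)
    q.add("c", Priority.HIGH)
    q.add("d", Priority.MEDIUM)
    print(name, drain(q))
```

In both queues the HIGH movers run before the MEDIUM one and the LOW one runs last; the discipline only changes the order of b and c, which share a priority.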

HTTP third-party transfers report back if there was a problem verifying that the transfer was successful. One possible problem is that the remote server failed to supply checksum information. Reporting of such situations is now fixed.

webdav

In previous versions of dCache, if a user cancelled a transfer shortly after a mover was created then there was a risk that the mover was abandoned. This is fixed with this release.

xrootd

In earlier versions of dCache, if the xrootd door timed out for a write request while waiting for the mover to send the redirection information then the mover was abandoned. This is fixed with this release.

Changelog 2.11.32..2.11.33

3073f9f
[maven-release-plugin] prepare release 2.11.33
194a072
hsmcp: update to match new HSM interface.
cbd0d90
(2.11) old replica manager: prevent pool being listed as offline when there are files with corrupt metadata
faf8dd4
webdav: Kill abandoned movers
bd8ed51
xrootd: Kill mover on aborted write
17aac10
pool: Fix transfer prioritization
719fe09
pool: Add nearline storage default timeouts
4c1fc64
pool: Let script nearline storage provider scale down when lowering limits
2af9726
dcap: Fix broken argument parsing
903ecd2
Fix build and startup regression
1a6ab63
dcap: Fix socket factory argument parsing
519ad35
pool: Fix error reporting in remote HTTP mover
d243819
prepare for next development iteration

Release 2.11.32

srm

Fix security vulnerability in srm EGI-SVG-2015-9495 (restricted).

Release 2.11.31

Changes affecting multiple services

In many cases, poolmanager would time out after ten seconds when asked which pool to use for a transfer. This behaviour was not intended. The consequence of this bug is protocol specific: for some protocols, the door retries internally while other doors propagate this error to the client. Another consequence was the increased risk of the domain hosting poolmanager running out of memory, particularly when staging files. This release fixes the underlying problem. It is recommended that all doors be upgraded.

Fix a performance regression when deleting directories; the fix affects the pnfsmanager and nfs services.

spacemanager

Fix bug that can result in leaked entries in space-manager file management from failed uploads. The problem is most likely triggered when a client cancels an FTP upload at the same time as the corresponding SRM upload request expires. The problem may also be triggered by communication failure with PnfsManager and the user deleting the failed upload before the pool retries.

srm

Fix support for credential delegation if the credential’s certificate is sent over multiple SSL frames. This most likely happens when the certificate exceeds the maximum frame size.

Changelog 2.11.30..2.11.31

8db6850
[maven-release-plugin] prepare release 2.11.31
774e283
spacemanager: Fix race condition leading to leaked reservation entries
23cd4ac
chimera: Resolve performance regression in directory deletion
2667ae5
doors: Fix pool selection timeout handling
cc588cf
Fix timeout math to avoid overflow
1ea1a31
srm: Fix credential delegation
a1b6dd1
Preparing for next release cycle

Release 2.11.30

ftp

Fix security vulnerability in gsi and kerberos authenticated ftp.

Release 2.11.29

Changes affecting multiple services

The System cell of each domain contains a version command that allows discovery of which dCache version is running. This release fixes this command. Note: there is no problem with the dcache script’s version command.

nfs

The NFS protocol provides access to additional information through dot commands. This release fixes the nameof and pathof commands for non-ASCII filenames.

pnfsmanager

Ensure that dCache respects the setgid bit on a parent directory when the user uploads a file via the SRM protocol. Important: the srm node should be updated at the same time.

srm

Enforce authorisation of requests to finalise or cancel an upload. When initiating an upload, the user’s uid and primary gid are taken as the request owner-uid and owner-gid respectively. Only users that have the same uid as the request’s owner-uid or are a member of the request’s owner-gid are allowed to cancel or finalise an upload. Important: the srm must be updated if the pnfsmanager is updated.
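
The authorisation rule described above can be expressed as a short sketch. The types and function names are hypothetical and are not taken from the srm code; only the rule itself (same uid, or membership of the request's owner-gid) comes from the description.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class User:
    uid: int
    primary_gid: int
    gids: FrozenSet[int]  # every group the user belongs to, primary included


@dataclass(frozen=True)
class UploadRequest:
    owner_uid: int
    owner_gid: int


def start_upload(user: User) -> UploadRequest:
    # The initiating user's uid and primary gid become the request owner.
    return UploadRequest(owner_uid=user.uid, owner_gid=user.primary_gid)


def may_finalise_or_cancel(user: User, request: UploadRequest) -> bool:
    # Allowed for the owner, or for any member of the owner's primary group.
    return user.uid == request.owner_uid or request.owner_gid in user.gids


owner = User(uid=1000, primary_gid=500, gids=frozenset({500, 600}))
colleague = User(uid=1001, primary_gid=700, gids=frozenset({700, 500}))
stranger = User(uid=1002, primary_gid=800, gids=frozenset({800}))

request = start_upload(owner)
for user in (owner, colleague, stranger):
    print(user.uid, may_finalise_or_cancel(user, request))
```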

webadmin

This release fixes a bug with the periodic building of billing plots. Previously, if the billing service took too long to reply then there would be no further updates to the billing plots.

Changelog 2.11.28..2.11.29

5b60c47
[maven-release-plugin] prepare release 2.11.29
e80fad6
rpm: remove “commented out” macros lines from spec file
c68d33d
chimera: fix nameof and pathof for paths containing unicode
ec9e87f
chimera: Let SRM respect setgid on upload
a784ba4
srm: Add authorization to put done and abort requests
daaacd0
module: cells
f37db11
(2.11) dcache-webadmin: add TimeoutCacheException to catch clause in billing service
22c54d3
[maven-release-plugin] prepare for next development iteration

Release 2.11.28

Changes affecting multiple services

Specifying the DISABLE_BROKEN_DH flag in the dcache.authn.ciphers configuration property disables all Diffie-Hellman ciphers if Java 7 is used; if dCache is run with Java 8 then this flag has no effect. Disabling DH ciphers is necessary because Java 7 contains a broken implementation of Diffie-Hellman, which was fixed with the release of Java 7 update 51. This dCache release updates the behaviour of the DISABLE_BROKEN_DH flag to allow Diffie-Hellman ciphers if dCache is run within Java 7 update 51 or later.
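
The updated decision can be summarised in a small sketch. How dCache actually inspects the JVM version is not described here; the function name and the parsing of version strings in the 1.7.0_51 style are assumptions made for illustration.

```python
import re


def dh_ciphers_disabled(java_version: str, disable_broken_dh: bool) -> bool:
    """Return True if Diffie-Hellman ciphers should be disabled.

    Assumes version strings of the form '1.<major>.<minor>_<update>',
    as reported by Java 7 and Java 8.
    """
    if not disable_broken_dh:
        return False
    match = re.match(r"1\.(\d+)\.\d+(?:_(\d+))?", java_version)
    if match is None:
        raise ValueError("unrecognised Java version: " + java_version)
    major = int(match.group(1))
    update = int(match.group(2) or 0)
    # Only Java 7 before update 51 has the broken implementation;
    # Java 7u51 and later, and Java 8, keep their DH ciphers.
    return major == 7 and update < 51


for version in ("1.7.0_45", "1.7.0_51", "1.8.0_40"):
    print(version, dh_ciphers_disabled(version, disable_broken_dh=True))
```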

The descriptions of the DISABLE_EC and DISABLE_RC4 flags have been expanded and updated.

infoprovider

An earlier patch added support for publishing a single dCache instance with multiple SRM endpoints. This proved incompatible with sites that use a DNS alias for their official SRM endpoint, so that change is reverted with this release. Support for publishing multiple SRM endpoints is available with dCache v2.13.

nfs

Add support for the TEST_STATEID and FREE_STATEID RPC methods. These are used by the Linux kernel during its recovery procedure. The previous lack of support for these methods could leak stateids, which in turn could lead to NFS4ERR_RESOURCE : Too many states being logged.

pool

Improve the error message (logged by the pool and in billing) should the FTP mover fail to connect to the client.

This release updates the error the pool reports to an NFS client when the client attempts to read or write and the pool cannot find the mover. This situation is most likely caused by restarting the pool. When it receives the modified response, the client will fall back to using proxy-IO. This allows NFS clients that were reading a file to survive a pool restart.

Changelog 2.11.27..2.11.28

77e6e79
[maven-release-plugin] prepare release 2.11.28
bdff7c7
Fix broken commit daeb9ed9
daeb9ed
pool: fix ftp mover to provide better logging when failing to connect
44f1d46
crypto: refine handling of broken ciphers
22a8c11
Revert “infoprovider: remove single SRM instance limitation”
a492eb4
libs: update to nfs4j-0.9.7
24c9981
pool: report IO error if we cant find NFS mover
f8b4dd5
[maven-release-plugin] prepare for next development iteration

Release 2.11.27

Changes affecting multiple services

The event logger records when messages are received and sent by cells. Some cell messages send string commands; if so, the log entry contains that string. In previous releases, such string commands were mistakenly double-escaped. This is fixed with this release.

ftp

Fix debug output to include the flavour of GSS implementation; for example, GssFtpDoorV1::secure_reply: going to authorize using k5

pnfsmanager

When a user attempts to delete a symbolic link using a non-NFS door, previous versions of dCache would resolve the symbolic link to determine whether the user is allowed to delete it: only if the user was allowed to delete the symbolic link’s target would the symbolic link be removed. With this release, the check verifies whether the user is allowed to delete the symbolic link itself.

pool

This release updates the JVM command-line to make it explicit that compressed object references are in use. This allows the Berkeley DB library to calculate a more accurate cache size, potentially improving pool performance.

In previous releases, any attempt to query a pool’s info (e.g., via the admin interface) while the pool was initialising would block until the initialisation had completed. This had the knock-on effect of blocking all subsequent messages. With this release, requesting information about a pool will not block during initialisation.

Fix high memory usage during pool initialisation if pool has any precious files.

xrootd

Earlier releases would record a stack-trace if xrootd received a malformed request. This is now fixed.

Changelog 2.11.26..2.11.27

8642172
[maven-release-plugin] prepare release 2.11.27
d517382
cell: Don’t quote string command in event logger
c671209
pool: Explicitly enable compressed oops to calculate correct cache size
d1b465e
pool: Fix locking bug causing high memory usage during pool initialization
f0cdbb4
chimera: Fix path resolution on delete
312606d
xrootd: Fix ‘xrootd logs stack-trace on malformed request’
d6cb4fc
move execution of the superclass method before any concrete class initializations
f6042b8
pool: do not list a repository during initialization
dc93f7e
[maven-release-plugin] prepare for next development iteration

Release 2.11.26

pool

The Berkeley DB based metadata storage can sometimes fail. Should this happen, the pool must be restarted. In previous releases, such problems were logged with an unclear message and a stack-trace; the pool would continue to operate but would fail all subsequent transfers; nothing made it clear the pool must be restarted. With this release, such Berkeley DB problems will be logged with a concise error message and the pool will be disabled, making the restart requirement explicit.

Fix regression in the migration module that could cause more replicas to be created than was desired and could cause replication to fail with a No targets error message.

srm

In previous versions of dCache, the ls operation in SRM occasionally returned incomplete or inconsistent results. This is now fixed.

Changelog 2.11.25..2.11.26

e3f3d89
[maven-release-plugin] prepare release 2.11.26
e5c0b9a
pool: Fix race condition in migration module
51b771e
srm: fix race condition in ls response
cb125c4
pool: Disable pool on meta data failures
9839cee
[maven-release-plugin] prepare for next development iteration

Release 2.11.25

Changes affecting multiple services

In previous releases, dCache required a layout file be present, even if that file was empty. This had two negative impacts: the dcache stop command (typically invoked automatically when updating a package) would not work, and nodes where only scripts (e.g., info-provider) are used would require unnecessary configuration. With this release, a missing file generates a warning but does not prevent the dcache stop command or scripts from working. This warning may be suppressed by setting dcache.layout.uri to an empty string.

httpd

The tape related queues (flush and restore) have no maximum limit, yet both the old web information and new webadmin show a dummy maximum value for these queues. This meaningless maximum value is no longer shown.

info

The info service collects information about, amongst other things, the interface(s) a door listens on. This is made available in different formats. The URL-formatted version, used by info-provider, always used an IP address even when a name was known. This is now fixed.

infoprovider

It is possible to run multiple SRM endpoints in dCache, provided certain restrictions are upheld. With this release, the info-provider publishes multiple SRM endpoints correctly.

nfs

Some shells, when attempting to overwrite a tag’s content using NFS, do so in a way that Chimera previously failed to support. This failure was reported back to the user as a remote I/O error. This release fixes this problem.

pool

Reduce latency when a pool processes a request to start a read mover. This improves dCache responsiveness when a client opens a file for reading.

The NFS mover uses the file’s size when processing read and write requests. For read operations, the file size cannot change. This release takes advantage of this to reduce the load on the underlying filesystem.

poolmanager

Poolmanager may attempt to create additional copies of a file, only to discover such attempts fail because of other constraints. This leads to the log file containing entries like P2P denied: already too many copies and P2P denied: No pool candidates available/configured/left for p2p or file already everywhere. With this release, such entries are logged at info level: they no longer appear in the log file, but are available via the poolmanager’s pinboard.

webdav

Fix proxied uploads when the client does not send the file size, either directly or via SRM; in particular, this fixes compatibility with ARC.

Changelog 2.11.24..2.11.25

917be16
[maven-release-plugin] prepare release 2.11.25
94e4959
systemtest: fix install command in credentials command
9448a43
infoprovider: remove single SRM instance limitation
cbd26fe
info: fix url-name to publish hostname
1b9a8fb
chimera: throw FileExistChimeraException if tag already exists
67b1fac
(2.11) webadmin: do not display numerical value for max restores or stores
8efb237
boot: Don’t fail on missing layout file
56a092c
httpd: Do now show maximum for restore and flush queues
83942e5
poolmanager: Reduce log level of p2p denial
d02a1c9
webdav: Fix proxied upload with chunked encoding
5f856dd
pool: simplify duplicate request handling
90255bc
[maven-release-plugin] prepare for next development iteration
f0a1de3
pool: reduce load on back-end file system

Release 2.11.24

Changes affecting multiple services

A new security configuration option allows the dCache admin to ban all SSL/TLS cipher suites that use the RC4 cipher. RFC 7465 states services MUST NOT accept an RC4-based cipher suite. Adding the DISABLE_RC4 option to the dcache.authn.ciphers property makes dCache compliant with RFC 7465. This option is not enabled by default to avoid possible regression with clients that require the RC4 cipher. This property affects the dcap (with GSI), ftp (with GSI), srm, webadmin, webdav (with SSL/TLS) and xrootd (GSI plugin) services.

pnfsmanager

Fixed ACL inheritance when uploading data through SRM. In earlier versions of dCache, a file uploaded through SRM failed to inherit any inheritable ACEs from the parent directory.

This release brings some modest performance improvements when creating upload directories. This improvement is available automatically only to sites that have not yet upgraded to 2.10 (or later). Sites already running 2.10 or later can enjoy the same improvements by deleting the upload directory (/upload by default) to allow dCache to recreate it. Important: deleting the upload directory will fail any current SRM uploads; it is recommended to do this during down-time.

pool

The pool’s migration module may be invoked with different pool selection modes: the -select option. The random selection option (-select=random) excludes pools that are full, but mistakenly considers replicas that could be deleted (i.e., non-sticky cached replicas) as part of the used space; this treats a pool as full even when the pool has removable files. With this release, pools that are full but contain some cached files are potential targets for random pool selection.
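The corrected eligibility rule can be illustrated with a short Python sketch. The `Pool` fields and the helper names are hypothetical and chosen only for illustration; this is not dCache's implementation, merely one way to express "a pool is full only if nothing in it could be removed".

```python
import random
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    total: int      # capacity in bytes
    used: int       # bytes occupied by all replicas, including removable ones
    removable: int  # bytes held by non-sticky cached replicas


def eligible_pools(pools):
    """Return pools that are not full once removable replicas are discounted."""
    return [p for p in pools if p.total - (p.used - p.removable) > 0]


def select_random(pools, rng=random):
    """Pick one eligible pool at random, or None if every pool is truly full."""
    candidates = eligible_pools(pools)
    return rng.choice(candidates) if candidates else None


pools = [
    Pool("pool_a", total=100, used=100, removable=0),   # full, nothing removable
    Pool("pool_b", total=100, used=100, removable=40),  # full, but 40 bytes can be freed
    Pool("pool_c", total=100, used=60, removable=0),    # has free space
]
print([p.name for p in eligible_pools(pools)])
```

In this sketch, pool_b is a valid target even though its used space equals its capacity, which mirrors the behaviour described for this release.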

In earlier releases of dCache, the save command failed to record the stage, flush and remove timeouts for nearline storage (rh set timeout, st set timeout, rm set timeout respectively). This is now fixed.

This release introduces the pool.mover.nfs.port.min and pool.mover.nfs.port.max configuration properties. Previously, pools listened on a random port between dcache.net.lan.port.min and dcache.net.lan.port.max; the two new configuration properties take the two dcache... configuration properties as default values. Once the pool is listening on a particular port, it will try to listen on the same port after restarting. If this proves impossible, another port is selected and will be used subsequently. Important: using the same port is important as the pool listening on another TCP port can trigger high load on the client machine.
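The "reuse the previous port if possible" behaviour can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the `choose_port` function and the `is_free` callback are hypothetical, and falling back to the first free port in the range is an assumption, since the text does not say how the replacement port is picked.

```python
def choose_port(previous, port_min, port_max, is_free):
    """Prefer the previously used port; otherwise pick another port in the range.

    previous -- port used before the restart, or None on first start
    is_free  -- callable returning True if the given port can be bound
    """
    if previous is not None and is_free(previous):
        return previous
    # Assumption: take the first free port; the real selection policy may differ.
    for port in range(port_min, port_max + 1):
        if is_free(port):
            return port
    raise RuntimeError("no free port between %d and %d" % (port_min, port_max))


# The caller would remember the returned value and pass it as `previous`
# on the next start, so that clients keep seeing the same TCP port.
occupied = {33115}
port = choose_port(33116, 33115, 33145, lambda p: p not in occupied)
print(port)
```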

Scripts

Fix JAR selection when a short-lived java process is started. This is typically done when using one of the scripts.

spacemanager

Fix writing into a reservation when using a protocol that does not provide a username or FQAN; for example, when writing into dCache using NFS and with the WriteToken directory tag set. Previously writing would fail with a Message processing failed: null group message.

srm

The srm service can generate RemoteException stack traces when dCache is behaving correctly. These are now logged at debug level and without a stack trace.

Changelog 2.11.23..2.11.24

0b576ed
[maven-release-plugin] prepare release 2.11.24
35c3355
chimera: Fix merge conflict and Java 7 compatibility
404da75
pool: introduce unique port number for nfs mover
81722be
pool: dedicated port range for nfs
54c3c4f
pool: Store nearline storage timeouts to pool setup file
3f52cb3
pnfsmanager: Create base upload directories without tags and acls
d8062ec
pnfsmanager: Inherit ACLs on upload with SRM
178ad45
chimera: Add ACL insert triggers for HSQLDB
f3cbbb7
Fix limited class path generation
bbea900
pool: Let random pool selection select pools with removable files
64fe463
crypto: allow banning of RC4 cipher suites
71c7042
spacemanager: Allow unowned files in reservations
f760652
srm: Don’t log erroneous stack trace
0704371
[maven-release-plugin] prepare for next development iteration

Release 2.11.23

Changes affecting multiple services

This release upgrades the BouncyCastle version from v1.45 to v1.46. The main motivation is to improve concurrency, thereby obtaining better performance on multi-core machines. The update affects the xacml and voms plugins to gPlazma and any door configured to use (or that always uses) GSI authentication: dcap, ftp and srm. There is no cross-dependency; domains hosting these services may be updated independently.

pnfsmanager

Fix possible leaked upload directory if PnfsManager takes too long to create the upload directory.

scripts

The chimera script provides both an interactive shell and the ability to run a Chimera operation as a single command-line invocation. This release fixes a problem where the single Chimera operation fails to provide the output if it is too short.

srm

If a client releases a reservation using the SRM protocol while another SRM client is querying for information about that reservation then there is a risk that the reservation will appear to exist for some 30 seconds, despite the reservation being successfully released. This release fixes this problem.

Changelog 2.11.22..2.11.23

4722ed6
[maven-release-plugin] prepare release 2.11.23
ebe7b56
srm: Fix cache invalidation of space meta data
151cba5
libs: Update to voms-api-java 2.0.10.1
7e1ba45
libs: use bouncycastle–1.46
a9cf314
libs: update jglobus to 2.0.6.9d
b9de736
chimera: Fix single command invocation of chimera utility
37a32aa
pnfsmanager: Resolve upload directory leak caused by missing reply flag
7f76d76
[maven-release-plugin] prepare for next development iteration

Release 2.11.22

Changes affecting multiple services

dCache uses a standard format to monitor the performance of various components: in the srm door to record how quickly SRM requests are processed (print srm counters command), in the nfs door and pool to monitor NFS performance (stats and nfs stats commands, respectively), the generic cell message monitoring (monitoring info command), and pnfsmanager service (the “Statistics” section in info). This release fixes a rounding error that prevents these statistics from including long-lived requests.

pinmanager

The pinmanager service has the ls command that allows the admin to limit the results to a specific pin or all pins against some PNFS-ID. This release fixes listing by pin id.

pnfsmanager

With this release, the path of automatically generated upload directories has changed slightly to improve SRM upload performance. Previously, these generated paths had the form <upload>/<unique-ID>, where <upload> is the value of the dcache.upload-directory configuration property (/upload by default) and <unique-ID> is some unique value (a UUID). With this release, these directories have the form <upload>/<processor-ID>/<unique-ID>, where <processor-ID> is some small integer value. Standard-conforming clients are unaffected by this change and no action is needed by the admin.
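As an illustration of the new path layout, the following Python sketch builds a path of the form <upload>/<processor-ID>/<unique-ID>. The `upload_path` helper is hypothetical and is not how dCache generates these paths internally; it only shows the shape of the result.

```python
import posixpath
import uuid


def upload_path(upload_root, processor_id):
    """Build <upload>/<processor-ID>/<unique-ID> using a fresh UUID as the unique ID."""
    return posixpath.join(upload_root, str(processor_id), str(uuid.uuid4()))


# "/upload" is the documented default of dcache.upload-directory;
# the processor ID 3 is an arbitrary small integer chosen for the example.
path = upload_path("/upload", 3)
print(path)
```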

Update pnfsmanager to avoid creating temporary directories if the srm has already discarded the request. This helps dCache recover more quickly when it is overloaded from SRM uploads.

pool

Fixed pool’s erroneous interpretation of the HSM timeout (4 hours, by default) as being from when a staging request was initially received, rather than from when it started processing the request (by starting the HSM script or via the new plug-in mechanism).
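The difference between the two interpretations of the timeout can be shown with a small Python sketch. The `StageRequest` class and `has_timed_out` function are hypothetical names; in this sketch a request that is still queued is simply treated as not subject to the HSM timeout.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_TIMEOUT = 4 * 60 * 60  # seconds; 4 hours, the default named in the text


@dataclass
class StageRequest:
    received_at: float                  # when the pool received the request
    started_at: Optional[float] = None  # when processing began; None while queued


def has_timed_out(request, now, timeout=DEFAULT_TIMEOUT):
    """Measure the timeout from the start of processing, not from receipt."""
    if request.started_at is None:
        return False  # still queued: the clock has not started in this sketch
    return now - request.started_at > timeout


# Received at t=0 but only started five hours later.
req = StageRequest(received_at=0, started_at=5 * 3600)
print(has_timed_out(req, now=6 * 3600))
```

Under the old interpretation this request would already have expired at six hours, because more than four hours had passed since it was received.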

This release fixes how a pool invokes the HSM script. Previously, the pool mistakenly omitted the additional arguments that an admin may configure the pool to include.

replicamanager

This release fixes the Can't clear the tables error reported when starting replicamanager service with certain PostgreSQL versions.

spacemanager

Fix the update space admin command so it can remove any ownership from a reservation. The reservation’s owner is allowed to release the reservation via the SRM protocol. If the reservation has no owner, it may only be released through the admin interface.

srm

The info admin command provides details about the srm service, including information specifically about SRM activity that can be processed synchronously or asynchronously: get, put, reserve-space, ls, and bring-online. This information describes how many requests are in each of the possible states, one of which is Waiting for CPU. This release fixes the output of the info command to show the correct Waiting for CPU values.

This release drastically improves srm service performance when there are a large number (e.g., thousands) of queued requests; for example, this brings considerable improvement for bulk bring-online requests.

This release also brings improved performance when the SRM has only a few queued requests; more specifically, when the number of queued requests is less than or equal to the number of currently unoccupied max-inprogress slots. For example, if dCache is not processing any GET requests then the first srm.request.get.max-inprogress GET requests are processed more quickly.
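The condition in the previous paragraph is simple arithmetic, sketched here in Python. The function name and arguments are hypothetical; the sketch only restates the condition and says nothing about how the srm scheduler is implemented.

```python
def fits_in_free_slots(queued, in_progress, max_inprogress):
    """True when every queued request can take a currently unoccupied slot."""
    free_slots = max(0, max_inprogress - in_progress)
    return queued <= free_slots


# With no GET requests in progress and a limit of 10, the first 10 queued
# requests satisfy the condition; an eleventh does not.
print(fits_in_free_slots(queued=10, in_progress=0, max_inprogress=10))
print(fits_in_free_slots(queued=11, in_progress=0, max_inprogress=10))
```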

Fix the srm service switching from synchronous to asynchronous processing when processing bulk requests with many files; in earlier releases, such large requests could prevent a request from falling back to asynchronous processing.

This release fixes the srm service so it cancels corresponding pinmanager requests when an SRM client aborts a bring-online request.

Changelog 2.11.21..2.11.22

4c59d41
[maven-release-plugin] prepare release 2.11.22
9c39979
(2.11) replica manager: fix table truncation
e4c93a1
common: Fix division by zero regression in gauges
330e50c
common: Fix rounding error in request gauge
1f7c373
pnfsmanager: Discard upload path creation request on TTL expiration
f563128
pnfsmanager: Use per-thread upload directory to reduce lock contention
7f0e94d
srm: Optimize scheduler performance
1fddcc2
srm: Further optimize SRM scheduler
00bf725
srm: Abort pinning when cancelling bring-online requests
cdb1c45
system-test: Enable MVCC and logging for HSQLDB
6a9e497
srm: Fix sync to async mode timeout
294677c
pinmanager: Fix listing by id
81e5207
pool: Add HSM options to hsm script remove callout
db48f74
srm: Fix queue size reporting
bbda848
pool: Fix timeout behavior of HSM requests
7c20abd
spacemanager: Allow spaces to become unowned
5974ee0
[maven-release-plugin] prepare for next development iteration

Release 2.11.21

alarms

Update output of the predefined ls admin command.

nfs

There was an error in how ACLs were interpreted by the nfs door where multiple ACEs for the same user were compacted ignoring the flags. With this release, ACEs against the same user but with different flags are honoured.
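The nature of the fix can be illustrated with a Python sketch that merges ACEs only when both the principal and the flags match. Everything here is a simplification made for illustration: the tuple layout, the bit values and the `compact_aces` helper are hypothetical, only allow-type ACEs are modelled, and the actual change lives in the nfs4j library.

```python
READ = 0x1   # hypothetical access-mask bits, for illustration only
WRITE = 0x2


def compact_aces(aces):
    """Merge ACEs sharing principal AND flags by OR-ing their access masks.

    Each ACE is a (who, flags, mask) tuple. Keying on `who` alone would
    wrongly fold together entries whose flags differ.
    """
    merged = {}
    for who, flags, mask in aces:
        key = (who, flags)
        merged[key] = merged.get(key, 0) | mask
    return [(who, flags, mask) for (who, flags), mask in merged.items()]


aces = [
    ("alice", "", READ),    # applies to the object itself
    ("alice", "o", WRITE),  # different flags: must stay a separate entry
    ("alice", "", WRITE),   # same flags as the first entry: may be merged
]
print(compact_aces(aces))
```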

Changelog 2.11.20..2.11.21

6b70d06
[maven-release-plugin] prepare release 2.11.21
64dd236
cli: remove stray separator in predefined alarms printout
092e99d
libs: update to nfs4j–0.9.6
fc05e23
[maven-release-plugin] prepare for next development iteration

Release 2.11.20

Changes affecting multiple services

This release fixes an NFS problem triggered when a second process opens a file that is already opened by another process on the same computer. The NFS specification allows the second open to reuse the “layout” (== mover) from the first open. With this release, the door will detect this and reuse the existing mover; the mover is updated to allow this. The nfs door and all pools accessible via the NFS protocol must be updated to this release or newer.

dcap

No longer log a stack-trace when transferring a file if the client is expected to connect to the pool (the “active client mode” option; the -A in the dccp command) and the pool is shut down before the client connects.

srm

Some SRM requests support bulk requests, where multiple SURLs are processed in the same fashion. The assumption was that the SURLs in a bulk request are distinct. Bulk requests have been observed that violate this assumption: a request in which a SURL appears more than once. This causes problems when staging files from tape. With this release, a SURL that appears multiple times in a request is processed exactly once.
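Processing each SURL exactly once amounts to removing duplicates while keeping the original order. A minimal Python sketch of that idea follows; the `unique_surls` helper and the example SURLs are hypothetical and do not reflect the srm service's internal code.

```python
def unique_surls(surls):
    """Drop repeated SURLs, keeping the first occurrence of each in order."""
    return list(dict.fromkeys(surls))


request = [
    "srm://srm.example.org/data/file1",
    "srm://srm.example.org/data/file2",
    "srm://srm.example.org/data/file1",  # duplicate of the first entry
]
print(unique_surls(request))
```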

Changelog 2.11.19..2.11.20

ad3b0c0
[maven-release-plugin] prepare release 2.11.20
85217e0
dcap: fix stack-trace when shutting down pool waiting for connection
4186cfd
system-test: add regenerated host and user credentials
d321114
srm: Remove duplicate SURLs in bringonline and get requests
40d4d2a
system-test: add missing dCache disposible CA certificate
5631940
[maven-release-plugin] prepare for next development iteration
dbd0fb4
nfs: share mover for the same client
ae78db5
nfs: use NFSv4MoverHandler instead of Map in embedded NFS server

Release 2.11.19

Changes affecting multiple services

In previous versions of dCache, deleting a directory with the last reference to a tag did not remove that tag. With the introduction of upload directories this problem became acute, as many such directories are created and deleted. With this release, deleting the directory with a tag’s last reference is very likely (but not guaranteed) to remove that tag’s inode. Further information about this will be made available via user-forum. The update affects both the nfs door and pnfsmanager; however, most sites will see the most benefit from updating pnfsmanager. Important #1: an updated nfs door will wait for database changes that updating pnfsmanager will enact. Important #2: the database changes enacted by pnfsmanager can take a while (an hour or so for large dCache instances) as the change adds an index to the t_inodes database table.

The “standard” Linux flag to mark an ACE as inherit-only is i, yet previous versions of dCache accepted only ‘o’ as the inherit-only flag. With this release, both o and i are accepted. The chimera shell, pnfsmanager and nfs door are affected; nodes hosting these services and scripts may be updated independently.

admin

Restore compatibility with loginbroker information for pcells; in particular, the problem affected information provided by the srm door.

dcap

Previously the dcap door provided the dcap client with the wrong errno (error number) should the client attempt to operate on a non-existent file or directory: EIO was returned instead of ENOENT. This is now fixed.

pool

The file integrity checking (single file, entire pool one-off check and background checking) produced ambiguous output; for example, not being able to scan a file because it is still being uploaded counted towards an “error” count, but no corresponding log message was included. With this release, the output is less ambiguous and distinguishes between corruption, temporary and more permanent problems.

replicamanager

Added an admin command to query pool-manager for an updated list of pools that are a member of the resilient pool group. This allows adjustments to the set of pools participating in replication without restarting the replicamanager service.

Changelog 2.11.18..2.11.19

65145d6
[maven-release-plugin] prepare release 2.11.19
0613e1f
replicamanager: add an admin command to re-fetch resilient pool group
855206c
acl: fix compatibility with linux ace
4969347
chimera: Delete unreferences tag inodes
d620902
admin: Restore pcells compatibility with loginrbroker
533655e
pool: update scrubber messages to be less ambigous
ea03937
dcap: fileAttributesNotAvailable must set pass ENOENT to the client
6604756
[maven-release-plugin] prepare for next development iteration

Release 2.11.18

Changes affecting multiple services

In prior versions of dCache, the path field that billing logs for transfers contained the actual transfer path; i.e., for SRM-initiated uploads this was the auto-generated path from the TURL and not the user-supplied path from the SURL. This proved confusing so, with this release, the path field now has the user-supplied path (i.e., from the SURL). An additional field (transferPath) has the transfer path. While this is not logged by default, billing configuration (the billing.text.format.mover-info-message and similar properties) may be updated to include it. All doors used by srm clients and the billing service should be updated.

dcap

This version of dCache uses the connected socket when discovering the IP address of the client for “channel binding”, rather than a constant value. This is important for dual-stack machines (with both IPv4 and IPv6 addresses) that host dcap doors with either gsi- or kerberos-based authentication.

pool

Fix parsing of certain HSM-related error messages that would previously have failed with an index out of bounds error.

spacemanager

In earlier versions of dCache, if a file was deleted while being uploaded, the space-manager would consider the transfer successful (and so using some of the reservation’s capacity) whereas the pool would delete the file immediately after the upload completed. Such “leaks” result in a reservation being reported as having more used capacity and less free capacity than it should. This release fixes the problem.

srm

The srm service maintains a counter of the number of requests for each different type. With earlier versions of dCache, if the srm encountered jobs that had timed out while the service was not running, these counters became inaccurate. This is now fixed.

When an srm request times out, dCache may need to take some action to “clean up”; for example, removing the upload directory. If srm is configured to discard all requests on start-up these cleanup operations did not happen. This is fixed with this release. As a consequence, startup times will be longer.

If srm.persistence.enable.store-transient-state is set to false (the default value) then transfer requests do not survive an srm service restart if the limit on concurrent transfers prevented handing out the TURLs. With this release, such requests survive an srm restart.

Previously, if srm.persistence.enable.store-transient-state is set to false (as is the default) then information needed to clean up a job might not be stored in the database. If, after restarting the srm, the job times out or is aborted then there is insufficient information to clean up, resulting in upload directories not being cleaned, pins (for bring-online requests) not being cleared, copy requests not being cancelled and lifetime extensions being lost. These problems are fixed with this release.

Changelog 2.11.16..2.11.18

e9f5d41
[maven-release-plugin] prepare release 2.11.18
f702e5f
all: Fix several NPEs when submitting billing messages
4afd8db
[maven-release-plugin] prepare for next development iteration
8ae5f7b
[maven-release-plugin] prepare release 2.11.17
0b38d78
srm: Fix scheduler counter initialization on restart
d9f7517
doors: Log real path in billing
0dc26ff
srm: Allow the SRM to take action upon cleaned requests
9439e9a
srm: Force save jobs when adding information needed for cancellation
89be30a
srm: Force save when job becomes RQUEUED
4cca91b
pools: fix parsing error in HsmRunSystem
9d643e2
javatunnel: use connected socket to discover local inet address
47d1a4b
spacemanager: Fix race condition leading to leaked files
87c1c6c
[maven-release-plugin] prepare for next development iteration

Release 2.11.16

Changes affecting multiple services

When dCache suffers a sufficiently high inrush of requests that a cell’s request queue is exhausted, new requests to that cell are rejected. While overall dCache handles this situation correctly (degrading gracefully), the internal message counting and event logging were not updated correctly. This is now fixed.

Fix a bug that triggers a NullPointerException in webadmin. Although the problem is logged in webadmin, the cause is in the doors supplying the information. Therefore, the dcap and ftp doors are to be updated.

When publishing an IPv6 interface, do not include the zone information. Zone information is an extension to IPv6, which appends a ‘%’ plus some opaque identifier to the regular address. Not all clients understand this extension and reject the address as invalid. As zone information is not useful anyway, with this release zones are no longer published. This update is for all doors.

Fix possible NullPointerException in httpd and admin services.

In most cases dCache will publish door URLs with a hostname; if the address cannot be resolved then the address is used instead. Previously IPv6 addresses were written incorrectly: without the square brackets. This is now fixed if the info service and the info-provider scripts are updated.
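The two IPv6 rules described above (drop the zone, and wrap a literal IPv6 address in square brackets inside a URL) can be sketched in Python with the standard `ipaddress` module. The `publishable_host` helper, the URL scheme and the port number are hypothetical and used only to illustrate the formatting.

```python
import ipaddress


def publishable_host(address):
    """Format a literal IP address for use as the host part of a URL."""
    bare = address.split("%", 1)[0]  # drop any IPv6 zone, e.g. "%eth0"
    ip = ipaddress.ip_address(bare)
    # IPv6 literals must be bracketed so the port separator stays unambiguous.
    return "[%s]" % ip if ip.version == 6 else str(ip)


for addr in ("fe80::1%eth0", "2001:db8::1", "192.0.2.10"):
    print("http://%s:8080/" % publishable_host(addr))
```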

admin

Fix division-by-zero error when the SSH client reports zero width or height. This happens when forcing allocation of a pseudo-TTY without having a real TTY (see the -t option to OpenSSH client).

pinmanager

Fix pinmanager so it does not log a stack-trace if it failed to fetch fresh pool status information from poolmanager while unpinning a file.

pool

Previously, the metadata reconstruction for files where the upload was not completed and written without the client specifying a retention policy would trigger a stack-trace java.lang.IllegalStateException: Attribute is not defined: RETENTION_POLICY. This is now fixed.

spacemanager

A number of sites have reported problems with the database resolving deadlocks. While such reports are expected and dCache behaves correctly when they happen, this release should reduce the likelihood of them appearing.

Changelog 2.11.15..2.11.16

6f73669
[maven-release-plugin] prepare release 2.11.16
d25d430
info/info-provider: publish valid IPv6 addresses
38f902c
loginbroker: strip off zone off published interface name
23c8f0f
admin: Avoid division by zero when the client reports a zero sized terminal
9cbe06f
spacemanager: Optimize space record deletion
23c4fe0
dcache: Make cell communication use the correct timeout
c5416a2
httpd,admin: Fix NPE in transfer collectors
f01e67c
doors: Fix race condition that causes NPE in webadmin
7b9f5f0
cells: Fix event queue counting bug
21d1f04
pinmanager: Don’t log stack trace when unable to fetch pool monitor
c9360d0
pool: Fix meta data reconstruction
b000b90
[maven-release-plugin] prepare for next development iteration

Release 2.11.15

Changes affecting multiple services

Avoid potential message loop when both sender and recipient domains are restarted while the message is in-flight.

admin

pcells distinguishes between a failure to send a message to a cell and that cell taking too long to respond. Support for that distinction was lost with 2.10; with this release, it is restored.

The tab-completion feature of admin parses the help-hint to discover what expansions are available. This has been updated to support more commands.

httpd

Fix compatibility with pcells. This requires a corresponding update to pcells.

Changelog 2.11.14..2.11.15

5ed59c4
[maven-release-plugin] prepare release 2.11.15
19d31f6
admin: Fix command completion
6046451
admin: Propagate NoRouteToCellException to pcells
a549cd7
cells: Restore CellExceptionMessage encoding
90c9976
httpd: Fix pcells compatibility
c975871
[maven-release-plugin] prepare for next development iteration

Release 2.11.14

admin

Restore support for pcells to gracefully handle timeouts.

Restore compatibility for pcells when querying space-manager.

cleaner

Fix bug reported as java.lang.ClassCastException: java.lang.String cannot be cast to [Ljava.lang.String;.

nfs

Provide a better cache of the FsStat information. Amongst other things, this improves the responsiveness of the df command.

pool

Fix pool reconstruction. If, on startup, the pool detects it is in an inconsistent state it will attempt to recover from that. With this release, this should work.

srm

Update SRM default property values. From various reports, it is clear that the current default values are not ideal for many sites. The following properties are adjusted:

  • srm.limits.jetty-connector.backlog increased to support bursts of activity; a larger value may be appropriate but not available due to default Linux configuration.

  • srm.request.threads reduced as processing is asynchronous.

  • srm.request.ls.threads default changed to the value of srm.request.ls.max-in-progress as ls requests are blocking.

  • srm.request.max-requests increased to satisfy user-demand; needed as clients typically request more concurrent TURLs than they make concurrent requests.

  • srm.request.max-transfers increased to same value as srm.request.max-requests. This way (by default) dCache never blocks requests pending a client returning a TURL.

  • srm.persistence.remove-expired-period increased to 10 minutes to reduce stress on the database.

  • srm.service.pnfsmanager.timeout decreased to two minutes as PnfsManager should respond within that time and clients will likely disconnect if they don’t hear a response within that time.

  • srm.service.spacemanager.timeout decreased to 30 seconds as the service should respond quickly.

  • srm.protocols.disallowed.get and srm.protocols.disallowed.put now include file protocol by default.

  • srm.protocols.loginbroker.timeout decreased to 20 seconds as this is a very light-weight service.

Support catching and logging some bugs that previously would be silently ignored.

Fix a ConcurrentModificationException caused when SRM processes two requests close together.

Changelog 2.11.13..2.11.14

fed09f8
[maven-release-plugin] prepare release 2.11.14
877a5ae
admin: Restore pcells compatibility
3624857
srm: log more SRM bugs
d16a878
admin: Restore timeout semantics for pcells compatibility
d6d5d46
cleaner: Fix class cast exception
3b9b54b
srm: Use more sensible default values
0177d04
srm: Fix ConcurrentModificationException in Axis
9469e73
pool: Fix pool entry reconstruction
cf51c39
[maven-release-plugin] prepare for next development iteration
a03e827
chimera: do not maintain time-based cached value of FsStat

Release 2.11.13

Changes affecting multiple services

This release improves the quality of information recorded when logging problems and fixes synchronization of lease time. This affects the nfs door and pools.

pool

When a client reads a file, the pool reads blocks of data from the local filesystem. When reading such a block, the pool could receive fewer bytes than requested. Previously, the pool assumed that this only happens when the end-of-file is reached; however, this is not guaranteed. Should this assumption be violated then the data sent to the client will be corrupt. In practice, the pool’s assumption is true for Linux and local filesystems; however, the code has been updated to remove this theoretical cause of corruption.
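The general technique for handling short reads is to keep reading until the requested number of bytes has arrived or a read returns nothing. The Python sketch below illustrates that loop; `read_fully` and the `ShortReader` test double are hypothetical names, and this is not the pool's actual mover code.

```python
import io


def read_fully(stream, size):
    """Read exactly `size` bytes unless end-of-file is reached first."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # an empty result is the only reliable end-of-file signal
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class ShortReader:
    """Test double that never returns more than 3 bytes per read call."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size):
        return self._buf.read(min(size, 3))


print(read_fully(ShortReader(b"0123456789"), 8))
```

A single `read` call on `ShortReader` would return only three bytes; treating that as end-of-file is the kind of assumption this release removes.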

Fix a problem where a door can trigger the pool to post-process a file many times. Each trigger starts a new thread, resulting in very large system load.

Fix a pool failure during startup when recovering a broken file whose access-latency and retention-policy are determined by spacemanager.

Extend the nearline SPI so it provides plugins with the file’s path and provides an easier way of reporting errors.

spacemanager

Improve the logging and handling of transitory errors.

Changelog 2.11.12..2.11.13

fe10021
[maven-release-plugin] prepare release 2.11.13
a43d7a0
pool: Fix and align pool meta data recovery with current pnfs manager
53aecd7
pool: ignore duplicated mover kill requests
f6f4ae3
spacemanager: Making logging and handling of transient errors more robust
879fd51
pool: Fix read corruption in HTTP mover
842977f
pool: Extend nearline storage SPI with path and custom error codes
e0a10bd
libs: update to nfs4j–0.9.5
3564478
[maven-release-plugin] prepare for next development iteration

Release 2.11.12

Changes affecting multiple services

If a bug is found in dCache then it should be logged. For the older admin commands, an uninformative message was logged.

The LocationManager service is started automatically in the broker domain (typically dCacheDomain). This answers requests from other domains with instructions on how to connect to the dCache cluster (typically a star topology, other domains connecting to the dCacheDomain). With this release, the FQDN is sent rather than the hostname. The node hosting the broker domain should be updated.

Previously, if a client attempted to write a file with Access Latency and Retention Policy that conflict with the selected reservation, a stack-trace was logged. This is now fixed: both the FTP door and spacemanager should be updated.

Previously, if a bug was discovered when starting up a service dCache would abort starting up with a non-informative log message. Now, the stack-trace is logged.

dcap

Fix a typo that resulted in the GSI-based and auth-based dcaps listening on the wrong port by default.

nfs

Log abandoned movers with the corresponding stateid.

pnfsmanager

When writing into dCache with SRM, the Access Latency (AL) and Retention Policy (RP) may be specified or omitted. Additionally, the client may specify a space reservation into which the file should be written. If the client specifies both, they must match.

In dCache, there are three mechanisms to support a client that specifies neither AL/RP nor space reservation: the directory can have AccessLatency and RetentionPolicy tags, the directory can have the WriteToken tag, and there are system-wide default AL/RP values.

Previously, dCache would reject uploads where the user-supplied AL/RP information did not match the AccessLatency/RetentionPolicy tags, despite the latter being intended as default values.

With this release, if the client specifies none of space token, AL or RP then the directory tags will be used. If a directory specifies both WriteToken and AccessLatency/RetentionPolicy tags, then these have to be consistent. If the directory contains a WriteToken tag and the client specifies AL/RP, then the client-specified values have to be consistent with the WriteToken tag.

pool

When attempting to upgrade non-precious and non-cached files (e.g. a file marked broken), the receiving end of the migration module would answer twice: first (correctly) with a failure and then (incorrectly) with a success. This is now fixed.

spacemanager

With an RDBMS, transactional deadlock rollbacks are normal behaviour; they happen when the database must choose between two conflicting and concurrent changes. Previously, spacemanager would aggressively retry when this happened, which has been observed to trigger performance degradation. This release includes several strategies to minimise the impact of this.

Previously, the default number of space-manager threads was the same as the number of database connections. This does not take into account that there is background activity that also needs access to the database. With this release, the number of threads has been lowered; having a large number of threads also increases the likelihood of seeing transactional deadlock rollbacks.

Fix the shutdown sequence of spacemanager. Previously, shutting down a busy spacemanager could lead to attempts to modify the database after all database connections were closed; such failures were logged.

Reduce logging on various DB errors; generic transient errors are now warnings and transactional deadlock rollbacks are logged at debug level.

Changelog 2.11.11..2.11.12

331b934
[maven-release-plugin] prepare release 2.11.12
3985c5e
LocationManager: Use fqdn instead of hostname
42dcac4
spacemanager: Controlled shutdown
82a24d3
spacemanager: Make request processing more robust
d68115b
spacemanager: Reduce log level on various transient DB errors
f820942
spacemanager: Minor simplification to link group updates
1275b07
spacemanager: Don’t log stack-trace on AL/RP/Reservation conflict
21bc585
spacemanager: Lower default for number of threads
324caca
nfs4: log abandoned movers with WARN
825f790
dcap: Fix typo that causes GSI and auth dcap to use the wrong port
9206f80
cells: log bugs found by CellShell
41f1e23
cells: fix how bugs are reported from ac_ command.
3231098
pool: Fix bug in migration module upgrade logic
44c4727
pnfsmanager: Fix upload to space token that conflicts with AL and RP tags
21703ff
[maven-release-plugin] prepare for next development iteration

Release 2.11.11

Changes affecting multiple services

Various scripts, including the dcache command, invoke the java command with a list of directories in which Java should look for support libraries. Previously, the current working directory was (mistakenly) included in that list. This could lead to odd behaviour; one particular example is running a dcache database command from the /etc/dcache directory. This release fixes this problem by excluding the current directory.

Normally, if running an admin command in some cell triggers a bug then the log file of the domain hosting that cell will contain a stack-trace. Previously, for certain admin commands (spacemanager and sweeper ls in the pool) this did not happen. This is fixed with this release of dCache.

When there is some problem in the communication between domains, an error message is logged. Previously the explanation for the problem was logged as “null”. With this release, a more descriptive explanation is provided.

The help text within the chimera CLI and for the pool’s migration commands was badly formatted; this release fixes this.

spacemanager

By default, when logging messages some contextual information is included (in square brackets). This typically includes the cell name and the kind of activity that triggered the log message. Previously, some messages omitted all contextual information, which is fixed with this release of dCache.

Fixed the ls spaces and ls files commands so they do not fail if there is a reservation without an owner.

All space-reservations may have an owner: a username or a VOMS group; ownership may be further restricted by VOMS role. When created through the admin interface, a reservation’s ownership is optional. With previous versions of dCache, if a space-reservation had no owner then anyone could release it. With this release of dCache, a reservation without an owner may only be released through the admin interface.

Changelog 2.11.10..2.11.11

3502237
[maven-release-plugin] prepare release 2.11.11
d6663cc
spacemanager: Change release authorization for unowned reservations
6907a8c
Fix valueSpec help parser
d55e512
spacemanager: Fix NPE in listing space reservations
4b3f32c
Maintain CDC of threads in decorated thread pool
717520d
spacemanager: Log unexpected exceptions
6d566d6
tunnel: use toString if IOException#getMessage returns null
f1a8a4b
Exclude cwd from classpath
8470f23
[maven-release-plugin] prepare for next development iteration

Release 2.11.10

Changes affecting multiple services

Fix potential database connection leak for nfs door and pnfsmanager service. This was triggered when attempting to create an already existing non-zero level via the .(access) or .(use) dot commands.

This patch fixes a race condition in Chimera that affects the nfs door and the pnfsmanager service. The effect is that, if two clients attempt to delete the same target (a file, link or directory) at the same time then the nlink count for the parent directory is decreased twice. “At the same time” means within the time taken to process the deletion; this is instance-specific but should be much less than 1 ms for well-configured systems. Sites can repair any incorrect nlinks with the following SQL:

UPDATE t_inodes SET inlink = (
    SELECT COUNT(*) FROM t_dirs  WHERE t_inodes.ipnfsid = t_dirs.iparent
) WHERE itype = 16384;

This is safe to run on a running production instance, but may take some time and will affect dCache’s responsiveness while running.
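To see what the repair statement does, it can be tried against a toy database first. The sketch below uses Python's built-in sqlite3 module with a deliberately simplified schema: only the table and column names that appear in the statement above come from dCache; the iname column, the sample rows and the function name repaired_link_counts are assumptions made for illustration. A real Chimera database has more columns and is normally hosted in PostgreSQL.

```python
import sqlite3
import stat

# The repair statement from the release notes, unchanged apart from whitespace.
REPAIR_SQL = """
UPDATE t_inodes SET inlink = (
    SELECT COUNT(*) FROM t_dirs WHERE t_inodes.ipnfsid = t_dirs.iparent
) WHERE itype = 16384
"""


def repaired_link_counts():
    """Build a toy namespace, run the repair and return {ipnfsid: inlink}."""
    db = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema: just enough for the statement to run.
    db.execute("CREATE TABLE t_inodes (ipnfsid TEXT PRIMARY KEY, itype INTEGER, inlink INTEGER)")
    db.execute("CREATE TABLE t_dirs (iparent TEXT, iname TEXT, ipnfsid TEXT)")
    db.executemany("INSERT INTO t_inodes VALUES (?, ?, ?)", [
        ("DIR", stat.S_IFDIR, 1),  # count too low: one delete was counted twice
        ("F1", stat.S_IFREG, 1),
        ("F2", stat.S_IFREG, 1),
    ])
    db.executemany("INSERT INTO t_dirs VALUES (?, ?, ?)", [
        ("DIR", "a", "F1"),
        ("DIR", "b", "F2"),
    ])
    db.execute(REPAIR_SQL)
    rows = db.execute("SELECT ipnfsid, inlink FROM t_inodes").fetchall()
    db.close()
    return dict(rows)
```

The value 16384 in the WHERE clause equals stat.S_IFDIR, so only directory inodes are rewritten; the two file inodes in the toy data keep their link counts.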

gplazma

The description for how to migrate away from using the forbidden useGPlazmaAuthorizationModule and useGPlazmaAuthorizationCell properties had caused confusion. The description has now been updated to be more explicit.

httpd

Fix filtering boxes and sorting on Pool Admin, Pool Usage, Poolgroups, Space Tokens and Tape Transfer Queue.

pnfsmanager

When a user uploads a file via SRM, a directory is created under the upload directory (/upload by default). Should this fail, the upload will fail; however, this was not logged. With this release, pnfsmanager logs why such failures happened.

pool

Previously, the NFS mover stopped when the client disconnected. This had two problems: a client that never connects to the pool leaves a mover that never dies and file transfers are not robust against transitory networking problems, despite the client attempting to reconnect. The latter problem is particularly bad when writing data as falling back to the door is not supported and such failures are not handled well by the Linux kernel.

With this release of dCache, if the client is not connected then the mover queries the door every 7.5 minutes to check it is still needed. The mover dies only when the client closes the file, the door declares the client is lost or the nfs door is stopped or restarted.

IMPORTANT Any pool upgraded to this version of dCache that is used for NFS transfers requires all NFS doors to be upgraded to this version or newer.

Previously, if there was a problem while a file was being uploaded using HTTP chunked transfer encoding then dCache would contain an incomplete file with no error mentioned in any log. With this release, such partially uploaded files will still exist (due to the partial upload semantics of pools) but the problem is logged on the pool and with billing.

spacemanager

Add a -blocking option to the update link groups admin command to ensure all subsequent commands will use the updated information.

srm

When uploading files, the client may choose not to specify an access latency, a retention policy or a space reservation for these files. Likewise, when reserving space, the client may choose not to specify an access latency for the reservation. Previously, the detailed view of the admin ls command showed null for these fields if the client did not specify them. With this release, those fields are omitted if they have no useful data.

Uploading a file with SRM involves three steps: preparing for the upload (srmPrepareToPut), uploading the file, and marking the upload finished (srmPutDone). The third step can fail, but previously the response from dCache was always “Upload failed.”. With this release, a meaningful error message is returned.

Changelog 2.11.9..2.11.10

958f95d
[maven-release-plugin] prepare release 2.11.10
2e2af56
pool: Propagate HTTP mover failures to pool
1038b1d
chimera: fix potential transaction leak on error path
ff87157
gplazma: update error message for forbidden properties
d214784
spacemanager: add blocking option to update link groups command
6b13868
srm: don’t list absent information in ls
8a2a289
pnfsmanager: log problems when creating upload directory
4d06447
(2.11) webadmin: make jquery selector specific to individual tables
25c16bb
(2.11) webadmin: restore missing components to respect jquery script options
c1fbc7f
srm: include the reason why upload failed
f879dc8
chimera: fix race condition on remove
25d6021
webadmin: ensure unique id attributes for all (currently) tested UI elements
a14c10b
pool: update NFS transfer service to validate inactive movers
8433507
[maven-release-plugin] prepare for next development iteration

Release 2.11.9

ftp

Fix default value for ftp.authz.readonly for plain (unencrypted) doors. This restores the default value to the dCache v2.6 default value of true.

Add the DN of a user in the access log file for “Grid FTP” access.

The response from the plain (unencrypted) ftp door if the user specifies the wrong password was badly formed. Although it is possible that some clients are robust against such incorrect responses, with this release the ftp door responds correctly.

nfs

Prevent leaking memory if a client’s data transfer is proxied (i.e., no use of pNFS) and there is a communication error between the pool and the nfs door.

webdav

Fixes a bug where, if a double-slash is present, all parts of the path leading up to the double-slash are ignored; for example, with the bug, a path like /a/b//c/d is handled as if /c/d was specified. With this release, double-slashes are treated like single slashes; the above example is handled as if /a/b/c/d was specified.
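The two behaviours can be contrasted in a few lines of Python. This is only an illustration of the rule described above, not the patched Milton code; both function names are hypothetical.

```python
import re


def buggy_path(path):
    """Mimic the old behaviour: everything before the last '//' is dropped."""
    head, sep, tail = path.rpartition("//")
    return "/" + tail if sep else path


def collapse_slashes(path):
    """Mimic the fixed behaviour: a run of slashes counts as a single slash."""
    return re.sub(r"/{2,}", "/", path)
```

For the path /a/b//c/d, buggy_path yields /c/d while collapse_slashes yields /a/b/c/d, matching the example in the text.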

Fix the NullPointerException triggered if a client attempts to upload a file as a child of some existing file.

Changelog 2.11.8..2.11.9

03ae138
[maven-release-plugin] prepare release 2.11.9
425e3a1
ftp: fix response if user fails to authenticate to weak FTP door
aefbf0a
nfs-proxy: remove proxy adapter on IO errors
ce99f22
ftp.authz.readonly is by default set to the string
781b1f5
webdav: fix double-slash bug by upgrading to patched milton
abcdb54
webdav: fix NullPointerException when PUT as a child of a file
e2431a8
ftp: add user to access log
3c4f8cf
srm-client: reduce default logging for srmfs
5efe3cc
[maven-release-plugin] prepare for next development iteration

Release 2.11.8

httpd

Fix two minor issues when authenticating with the webadmin interface: “unauthorised access” and being redirected to the home page. The unauthorised access error can occur when selecting “Login” under the bird logo (top right corner); this is now fixed. The redirection problem occurs when selecting a tab that requires administrative privileges while not logged in; this redirects the browser to the login page. Previously, after a successful login, the browser was redirected to the home page. Now the browser is redirected to the selected tab.

info-provider

The previous bug-fix release of dCache included a regression in the info-provider. The SRM endpoint URL (which starts httpg://) omitted the port number. This is now fixed.

nfs

dCache v2.11.4 introduced support for different levels of write stability. Unfortunately, this exposed a bug in the Linux kernel, which is fixed with Linux v3.16. As a work-around, this release removes different-write-stability support.

pool

Fix logging that the HTTP mover returned an error so that a generic message (“An unexpected server error has occurred.”) is logged if no more concrete message is available, rather than “null”.

srm

The info command in the srm service shows the current number of requests in each state for each request-type, along with the maximum allowed. Previously, the total failed to include requests in READY state.

webdav

A previous bug-fix release fixed how dCache responds when the client attempts to DELETE a non-existent file. Unfortunately, this triggered a different problem where such activity results in a stack-trace that starts java.lang.ClassCastException: java.lang.String cannot be cast to javax.security.auth.Subject. This second problem is now fixed.

Changelog 2.11.7..2.11.8

93d9550
[maven-release-plugin] prepare release 2.11.8
3b88eed
Revert “info-provider: fix publishing SRM port number”
44fd387
pool: Fix logging in HTTP mover
8f9628c
srm: Fix calculation of total number of requests
d43ff04
webdav: Alternative to fixing return code of DELETE of absent file
a5ff2cc
nfs: always reply FILE_SYNC4 on write
97dc4fc
(2.11) webadmin: fix login redirect bug
3fb9619
[maven-release-plugin] prepare for next development iteration

Release 2.11.7

Changes affecting multiple services

The webdav and srm doors log SSL handshake failures as “General SSLEngine problem”. With this release, a more meaningful error is reported.

Reduce memory usage by avoiding multiple instances of common strings. Although this improves many services, pools are most affected.

Previous dCache releases included some of the code necessary for pcells, even though dCache made no use of this code. As part of on-going consolidation effort, this code has now been removed from regular dCache releases.

The cell.name is a core property within dCache. Its omission (most easily, by forgetting to specify pool.name) prevented the dcache script from working. The script is now robust against such errors.

If the dcache check-config command discovers that a deprecated property is being configured, it looks for the alternative property that should be used instead. In earlier releases, under certain circumstances, the wrong alternative property is selected. This is now fixed.

It is possible that a sudden burst of activity from many clients exceeds dCache’s capacity to queue such requests. Although dCache is designed to degrade gracefully under such circumstances, there existed the possibility of certain requests becoming stuck or memory leaking. This is now fixed.

info-provider

In previous releases, the info-provider assumed the broker domain is dCacheDomain. This assumption has been removed.

nfs

Upgrade to nfs4j v0.9.4, which brings several improvements to the door: it adds the pnfs and nopnfs export options to allow or disallow pNFS; it fixes export parsing if a host is mentioned multiple times (note that localhost must now have an explicit entry in the exports file); and it fixes a potential deadlock when closing a file on a busy door.

Allow an nfs door to avoid being published to loginbroker. Not being published in loginbroker means the door is unavailable to SRM and is not published by info-provider.

This release adds the kill client command, which tells the door to simulate being restarted to a specific client while leaving the other clients unaffected. This is useful when resetting a confused NFS client. This command should not be necessary. If useful, using this command points to a bug elsewhere. Therefore, such uses should be reported to the dCache developers.

pinmanager

If the maximum concurrency (pinmanager.cell.threads.max) exceeds the number of database connections (pinmanager.db.connections.max) then there is a risk of pinmanager becoming deadlocked under heavy load. This release reduces the default maximum concurrency to avoid this.

pnfsmanager

Fix NullPointerException when a file is stored in a directory with an empty tag.

pool

Previously, reloading the pool configuration produced an error when a nearline storage was already defined in the existing configuration. This is now fixed.

A recent release fixed a bug that caused pool.mover.ftp.allow-incoming-connections to be ignored. Fixing that bug revealed another that caused the property to have the opposite effect. This is now fixed.

dCache pool configuration allows passing (fixed) arguments to the HSM script. In earlier releases, these arguments were supplied to the script in an arbitrary order. While HSM scripts should not depend on the order, they might; so, with this release, the order is preserved.

webdav

dCache behaves correctly if the client interrupts a proxied transfer; however, this was logged as a bug. This is now fixed.

Make HTTP third-party copy feedback and detecting when client disconnects more robust against internal dCache problems.

Changelog 2.11.5..2.11.7

7681aec
[maven-release-plugin] prepare release 2.11.7
61546fa
info-provider: fix publishing SRM port number
1900b32
[maven-release-plugin] prepare for next development iteration
a80959d
[maven-release-plugin] prepare release 2.11.6
3966630
webdav: don’t log a stack-track when proxy transfer is interrupted
b27b555
info-provider: remove dCacheDomain assumption.
1aa9550
fix broken commit
1a9e691
webdav: make progress markers more robust
61ba9d1
Fix compilation with Java 7
2ba017b
pool: Fix regression preventing configuration to be reloaded
fc42087
cells: Fix message timeout in case of thread pool overflows
6e59e20
chimera: handle NULL field of directory tags
5ddeb26
pinmanager: Adjust pinmanager defaults
231e5cf
Internalize common strings in StorageInfo
27a8c05
bootloader: Don’t fail on missing cell name
603334c
check-config: fix warning for deprecated properties
e75afd3
(2.11) gitignore additions for IntelliJ
71ef741
libs: move to nfs4j–0.9.4
e1d6930
nfs4: add command to kill client by server short hand id
fb78d4e
webdav,srm: provide useful log messages for SSL handshake failures.
3f879f2
pool: Fix regression causing FTP movers to default to proxy mode
ad66c99
nfs: Allow not to publish to loginbroker
be0d7cc
pool: Preserve HSM option ordering
8187d37
[maven-release-plugin] prepare for next development iteration
3210727
cells: Delete pcells related classes

Release 2.11.5

Changes affecting multiple services

Although dCache system configuration property names do not contain spaces, it is possible to define such properties. Previously, doing so broke the dcache command. This is now fixed.

Update for the nfs door and the pools that allows faster restart. The door is also updated to be more responsive to namespace operations when suffering large number of proxy reads.

If the number of concurrent connections is set to -1 then the ftp and dcap doors will leak memory, eventually triggering an out-of-memory error that will restart the domain. This is now fixed.

admin

Fix erroneous reporting of bugs when the user supplies incorrect arguments to an admin command.

alarms

Fix issues from the alarms service when it shares a domain with other dCache services. This could lead to errors when shutting down dCache.

Fix resource leaks and shutdown problems with the alarm service.

dcap

Fix the permissions check for bring-online with plain dcap and a NFSv3 mounted dCache.

Fix regression against v2.6 and earlier dCache in how gsidcap doors are known to SRM and how they are published in BDII/GLUE.

ftp

The ftp door supports the HELP command, as some clients use this command’s output to discover if certain optional functionality is available. This release fixes the output from this command.

The ftp door’s access log records some responses without the corresponding command-line from the client; subsequent responses (if the response is multiple-lined) and all responses due to STOR and RETR commands were affected. This is now fixed.

nfs

The info command now lists the proxy adaptors. A proxy adaptor is created when the client rejects the pool and falls back to reading data from the door.

Improve shutdown of nfs door when the embedded portmapper is used.

pool

Improve which interface the ftp mover selects when the client is redirected to the pool. In particular, the door will only redirect the client if the IP protocol matches (IPv4 vs IPv6); for example, if a client connects with IPv6 and the pool has no IPv6 address then the door will now proxy the data connection.
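The redirect-or-proxy decision described above can be sketched as an address-family comparison. The Python below is an illustration under stated assumptions, not dCache's implementation: the function names are hypothetical and the addresses come from the documentation ranges.

```python
import ipaddress


def matching_pool_addresses(client_ip, pool_ips):
    """Return the pool addresses that use the same IP version as the client."""
    version = ipaddress.ip_address(client_ip).version
    return [ip for ip in pool_ips if ipaddress.ip_address(ip).version == version]


def must_proxy(client_ip, pool_ips):
    """The door proxies the data connection when no pool address matches."""
    return not matching_pool_addresses(client_ip, pool_ips)
```

An IPv6 client facing a pool with only the address 192.0.2.10 gives must_proxy a true result, which corresponds to the example in the text.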

The pool.mover.ftp.allow-incoming-connections property had no effect. This is now fixed.

Add the IoMode (whether the mover is accepting new data, or supplying existing data) to the status line describing the mover.

Improve shutdown of nearline storage subsystem; now dCache will attempt to cancel all ongoing activity before shutting down the pool. The previous asynchronous approach could lead to IllegalStateException being logged if the pool was busy at that time.

spacemanager

Adjust spacemanager schema migration to be aware of earlier dCache bugs and to work-around site-local indexes that clash with new indexes that dCache needs.

In previous versions of dCache, both spacemanager and pnfsmanager had default access-latency and retention-policy settings. The spacemanager default retention-policy is only used if the admin doesn’t specify a value when creating a reservation through the admin interface. The spacemanager default access-latency is also used as a default for the admin command; additionally, it is used when processing an SRM reserveSpace request that omits the (optional) access-latency information — the retention-policy is mandatory in such SRM requests.

With this release, the spacemanager no longer has default access-latency or retention-policy properties. Creating a space-reservation in the admin interface now requires specifying both the access-latency and retention-policy. When processing an SRM reserve space request without an access-latency, the spacemanager will check from which linkgroups the user is authorised to reserve space. It will choose an access-latency based on the request’s retention-policy. If the user is authorised to reserve space from link-groups such that both access-latency options (ONLINE and NEARLINE) are possible, space-manager will prefer to reserve ONLINE if the retention-policy is REPLICA and NEARLINE if the retention-policy is CUSTODIAL.
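The selection rule for SRM requests without an access-latency can be written down compactly. The following Python sketch assumes that the caller has already worked out which access latencies the user's authorised link groups allow; that input, the function name and the error handling are assumptions, and the notes do not say what happens for other retention policies.

```python
# Preference stated in the release notes when both latencies are possible.
PREFERRED = {"REPLICA": "ONLINE", "CUSTODIAL": "NEARLINE"}


def choose_access_latency(retention_policy, authorised_latencies):
    """Pick an access latency for a reservation request that omitted it."""
    options = set(authorised_latencies)
    if not options:
        raise ValueError("no authorised link group can host the reservation")
    if len(options) == 1:
        # Assumption: with a single possibility there is nothing to choose.
        return next(iter(options))
    preferred = PREFERRED.get(retention_policy)
    if preferred is None or preferred not in options:
        raise ValueError("no documented preference for " + repr(retention_policy))
    return preferred
```

For example, a REPLICA request from a user who may reserve both ONLINE and NEARLINE space resolves to ONLINE, while the same user's CUSTODIAL request resolves to NEARLINE.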

srm

Previously, if the SRM client requests listing a directory, specifies a non-zero offset and does not limit the response size then dCache would fail this request with an IllegalArgumentException. This is now fixed.

webdav

With the recent upgrade of the Milton library some new behaviour was introduced. One example is that, under certain circumstances (and to support certain clients) the Milton library returns a 401 (not authorised) when attempting to delete a non-existing file. Unfortunately, this change then broke ATLAS clients. This patch updates dCache so it returns 404 (not found) under these circumstances.

Fix the response when a client requests a byte-range beyond the end of a file. This is necessary for compatibility with ARC clients.

Changelog 2.11.4 to 2.11.5

78f8332
[maven-release-plugin] prepare release 2.11.5
fd0d631
webdav: Fix reported content length for partial GETs
5f077e8
ftp: ensure client command line is available in access log
7d5f276
webdav: Fix return code on DELETE of absent file
7e25854
srm: fix semi-infinite ls range with non-zero offset
c2df681
pool: Partially fix interface selection for FTP mover
aac1473
pool: Fix typo that breaks pool.mover.ftp.allow-incoming-connections
97fbb48
shell: fix shell oracle for configuration keys with a space
12f2cfe
nfs: show current proxy-io transfers in the door
7bc26c6
chimera-cli: fix chown
ecbaf18
dcap: Fix regression in published protocol family
7990ab6
ftp: fix help output
efe526e
loginmanager: Fix leak caused by absent child limit
3fa5684
spacemanager: Fix null constraints and other schema migration issues
1e00567
alarms: Fix several resource leak and shutdown problems
41a525c
alarms: Detach alarms appender on shutdown
abcc92e
dcap: check for url in some commands
9fc7213
pool: Fix shutdown of nearline storage subsystem
5d33937
admin: Fix error reporting
ecbc33a
pool: add Io mode into mover’s status line
7062f68
spacemanager: Get rid of access latency and retention policy defaults
49ecc70
nfs: stop embedded portmap on shutdown
88daa03
libs: update to nfs4j–0.9.3
ae567ab
[maven-release-plugin] prepare for next development iteration
70349c7
dcap: refactor PnfsSessionHandler to unify permission check and url handling

Release 2.11.4

Changes affecting multiple services

With this release, the srm, webdav and httpd doors will log SSL handshake failures.

ftp

Relax requirements for the EPSV and EPRT commands. Previous versions of dCache would reject all RFC 2428 commands from an IPv4 client. This has been relaxed with this release. Now, IPv4 clients can use EPRT. IPv4 clients can also use EPSV if delayed passive is enabled.

info

Info service includes a safety feature that prevents it from bombarding the rest of dCache with too many messages. With this release, the safety limit is reduced allowing info to send messages more often.

Reduce the delay between subsequent messages of the same type. This allows the info service to recover information more quickly after being restarted.

pool

The NFS specification allows the server to specify multiple addresses when telling the client where to connect; for example, specifying both an IPv4 and an IPv6 address, or both addresses for multi-homed machines. This requires the client to choose the appropriate interface. For Scientific Linux 6, the kernel client will always use the first supplied address in the list and fail if it cannot access the pool with that address. With this release, pools will order the list, using heuristics to select which IPv4 address is “correct” and list it first.

Prior to flushing a file to tape, the pool checks if the name-space entry still exists and deletes the replica (rather than flushing) if not. Version 2.10.7 introduced a regression where, if this happens, an IllegalStateException stack-trace is logged. This is fixed with this release.

The NFS protocol includes support for syncing when writing data, either as an explicit command or as a flag within a write request. After sync-ing, queries against the namespace (e.g., the file’s size) are expected to reflect the file’s state when sync-ed. This release introduces support in dCache for this feature.

Allow an NFS client to detect when the pool has been restarted and has therefore lost any pending (i.e., not committed) writes. NFS-compliant clients will handle this situation without data loss.

poolmanager

With previous dCache versions, the WASS selection algorithm had a bug where it could (mistakenly) consider pools full if all pools had very fresh files. This is fixed with this release.

spacemanager

Fix various issues to improve the robustness of spacemanager: handle failed uploads correctly, handle files deleted during upload correctly, and increase robustness against dCache losing (internal) messages.

webdav

Upgrade to Milton v2.6. This fixes the buffering problem where a proxied vector read request results in the entire file being written to a tmp directory and not deleted. With this release, requests for 100 kiB or less data result in no data being written to disk; requests for more than 100 kiB are still written to disk, but only the data needed to satisfy the request is stored and the file is deleted once the response has been sent. Some issues persist: data isn’t deleted if there is a failure sending it to the client and the whole file is requested from the pool.
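The memory-then-disk buffering described above resembles Python's tempfile.SpooledTemporaryFile, which keeps data in memory until a size limit is exceeded and only then moves it to a temporary file. The sketch below uses that class purely as an analogy; it says nothing about how Milton implements its buffer, and the helper names are hypothetical.

```python
import tempfile

THRESHOLD = 100 * 1024  # 100 KiB, the limit quoted in the release notes


def buffer_response(data):
    """Buffer `data` in memory, spilling to a temp file above THRESHOLD bytes."""
    buf = tempfile.SpooledTemporaryFile(max_size=THRESHOLD)
    buf.write(data)
    buf.seek(0)
    return buf


def spilled_to_disk(buf):
    # `_rolled` is a CPython implementation detail; it is read here only
    # to make the spill visible in this example.
    return buf._rolled
```

Closing the returned object removes any temporary file, which corresponds to deleting the buffered data once the response has been sent.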

Changelog 2.11.3 to 2.11.4

1256cc0
[maven-release-plugin] prepare release 2.11.4
f703858
ftp: Relax requirements for EPSV and EPRT
ccbf4ad
nfs: return correct verifier on WRITE and COMMIT
8682625
nfs: set file size in namespace on stable_how is FILE_SYNC4
fe8e4b8
poolmanager: Fix full pool detection for WASS
4411aaa
srm,webdav,httpd: Log SSL handshake failures
ee165e7
ftp: Relax requirements for EPSV and EPRT
a510d81
Upgrade to Milton 2.6
7b58ae1
info: Reduce safety limit to 50 ms
b779e2f
pool: reorder ip addresses returned to NFS client
6ec9e9f
pool: Fix ISE when flushing deleted file
0254527
info: Reduce delay between messages
fa57fa3
spacemanager: Fix various error recovery scenarios
bacdb02
[maven-release-plugin] prepare for next development iteration

Release 2.11.3

Changes affecting multiple services

Restarting a domain within an active dCache instance can lead to a domain receiving messages for a cell as it is starting. Strict control is required to avoid the cell attempting to process messages before it is ready. This release fixes one place where this control was missed, which could lead to a NullPointerException. While this problem can affect any core dCache service, it was noticed with the spacemanager service.

httpd

dCache versions including and after 2.11.0, 2.10.9, 2.9.12, 2.8.16, 2.7.21 and 2.6.36 required sites to delete existing RRD files when upgrading; i.e., run the command rm -f /var/lib/dcache/plots/*.rrd when the domain hosting the httpd service is stopped. This release reverts that change, but requires sites that have already upgraded to repeat the rm command. Sites upgrading from an earlier dCache version do not need to delete anything.

Fix a potential NullPointerException in poolCollector.

nfs

Protect against a NullPointerException if the client attempts to read the contents of a file’s level where that level exists in the database but contains a Nil value. This does not happen under normal circumstances.

pool

Previously, any problems found when the xrootd client was writing to a pool were silently ignored: the pool, the client and billing were all unaware that there was a problem. With this release, problems are logged in billing and reported back to the client. The client is also disconnected.

The Berkeley DB, which may be used to store file metadata on the pool, does not like being interrupted. The pool tries hard to avoid interrupting reading or writing; this release fixes one place that slipped through.

Fix a regression in the output from the info command: it did not include statistics about the number of HSM requests and HSM timeouts. This also fixes how active HSM jobs are counted: cancelled jobs are still active until the underlying job has ended.

Improve the error message reported back to the xrootd client should its request have an invalid or missing UUID.

Changelog 2.11.2 to 2.11.3

74ce914
[maven-release-plugin] prepare release 2.11.3
0d23d50
chimera: protect against NPE in FsSqlDriver#read
46816f0
xrootd: Propagate mover errors to dCache
5372282
pool: Restore flush and stage stats in info
2c87025
pool: Improve xrootd error message on missing UUID
a22ecc5
Fix NPE in cell initialization
393cf34
Fix NPE in httpd service
1768254
pool: Avoid interrupting Berkeley DB in migration module server
d1cfc7b
dcache-webadmin: revert rrd data source names
2342757
[maven-release-plugin] prepare for next development iteration

Release 2.11.2

Changes affecting multiple services

dCache will discard queued internal messages where the sender is no longer expecting a reply (due to internal time out); this helps an overloaded system recover quickly. Such discarded messages are logged. This release provides better logging when this happens: string commands are logged correctly and both the time-to-live and the age have been added.
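The discard rule described above can be sketched as follows. This is only an illustration of the idea, not dCache's implementation: the QueuedMessage type, the split_expired function and the wording of the log line are hypothetical, and the only things taken from the text are that expired messages are dropped and that the log entry now carries both the time-to-live and the age.

```python
from dataclasses import dataclass


@dataclass
class QueuedMessage:
    """Hypothetical stand-in for a queued internal message."""
    command: str        # string form of the command, as it would be logged
    enqueued_at: float  # time the message was queued, in seconds
    ttl: float          # how long the sender waits for a reply, in seconds


def split_expired(queue, now):
    """Return (messages to keep, log lines for the discarded messages)."""
    keep, log_lines = [], []
    for msg in queue:
        age = now - msg.enqueued_at
        if age > msg.ttl:
            # The sender has already timed out, so processing is pointless.
            log_lines.append(
                f"Discarding '{msg.command}': age {age:.1f}s exceeds TTL {msg.ttl:.1f}s"
            )
        else:
            keep.append(msg)
    return keep, log_lines
```

Dropping such messages before they are processed is what lets an overloaded system catch up: no work is spent on replies that nobody is waiting for.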

Log entries from the xrootd door and, on the pool, from the HTTP and xrootd movers included incorrect context information, always showing the first connection. This has been fixed.

httpd

On startup, httpd logged the warning “has uncovered http methods for path: /”. This has been fixed.

Changelog from 2.11.1 to 2.11.2

4334f9a
[maven-release-plugin] prepare release 2.11.2
edde62f
cells: Refine TTL discard message
12f082e
xrootd,pool: Fix xrootd and http logging context
a1fde31
jetty: update webdefault.xml to adhere to new specification
8862efd
[maven-release-plugin] prepare for next development iteration
c8ff984
(2.11) dcache-webadmin: change Jetty setting so .war is not unpacked

Release 2.11.1

Changes affecting multiple services

Doors report to billing when a file is deleted. Previously, many doors neglected to include the PNFS-ID and sent only the path. This can be ambiguous so, with this release, all doors send the PNFS-ID in addition to the path. Some doors also failed to send the file size (if known) and client IP address. These, too, have been fixed. In summary:

  • dcap additionally sends: PNFS-ID, file size and client address.
  • ftp additionally sends: PNFS-ID.
  • srm additionally sends: client address.
  • webdav additionally sends: PNFS-ID and file size.
  • xrootd additionally sends: PNFS-ID.

gplazma

Update ldap plugin to support both RFC 2307 and RFC 2307bis LDAP schema types. Additionally, the memory footprint of this plugin has been reduced and its performance improved.

nfs

Update nfs4j to v0.9.2. This brings some small performance benefits when listing a directory and for reading data through NFS v3.

Fix a bug where, when a client’s read or write activity is proxied, the corresponding dcap mover was not removed if the client did not wait for the queued mover to start. Previously, such movers would accumulate on the pool.

srm

Do not download a DTD file from java.sun.com when starting up.

webadmin

The cell admin page now displays command output in a monospace font.

Remove a javascript error due to missing clojure dependency.

Sending commands in cell admin page is fixed.

xrootd

Improve the xrootd logging plugin to provide additional information.

Changelog from 2.11.0 to 2.11.1

7a3f37a
[maven-release-plugin] prepare release 2.11.1
f744732
doors: Add pnfs ID to billing remove entries
19faee9
webadmin: Fix ClassCastException in cell admin
6d40f30
Fix compilation for Java 7
362801a
webadmin: remove redundant head element in alarms panel html
5173d34
dcache-webadmin: change output field of cell admin page to monospace font
302f675
srm-server: add servlet webapp-2.3.dtd file
f44f137
(2.11) dcache-webadmin: eliminate clojure dependency
5f3fc62
libs: update to nfs4j-0.9.2
68b551b
dcache-xrootd: Extend and improve xrootd access log plugin
d18b88c
gplazma-ldap: pull only required attribute
d581502
Revert “(2.11) dcache-webadmin: change Jetty setting so .war is not unpacked”
a589e13
gplazma-ldap: support uniqueMember based group membership query
07935f7
(2.11) dcache-webadmin: change Jetty setting so .war is not unpacked
a11f5a9
nfs-proxy: kill mover on close
0f5ec5a
[maven-release-plugin] prepare for next development iteration

Release 2.11.0

Changes affecting several services

Short session strings

FTP, WebDAV and NFS doors now use base-64 encoding for the session counter. For such doors, sessions have the form door:<cell>@<domain>:<counter>. The <counter> part used to be base-10 encoded (i.e., a normal integer number). With dCache v2.11, this part is now a base-64 number. The result is shorter log lines and billing entries.
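To illustrate why a base-64 counter shortens the session string, here is a minimal sketch. It assumes the counter is encoded as big-endian bytes using the URL-safe Base64 alphabet without padding; the exact alphabet and byte layout dCache uses are not stated here, and the cell and domain names are made up for the example.

```python
import base64


def encode_counter(counter: int) -> str:
    """Encode a non-negative counter as unpadded URL-safe Base64 (assumed scheme)."""
    length = max(1, (counter.bit_length() + 7) // 8)
    raw = counter.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def session_string(cell: str, domain: str, counter: int) -> str:
    """Build a session string of the form door:<cell>@<domain>:<counter>."""
    return f"door:{cell}@{domain}:{encode_counter(counter)}"


counter = 1234567890123
print(len(str(counter)), len(encode_counter(counter)))  # decimal length vs base-64 length
print(session_string("WebDAV-example", "exampleDomain", counter))
```

The saving only shows for large counter values; with this particular encoding a very small counter is no shorter than its decimal form.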

Improved shutdown

When a dCache domain shuts down, it asks each cell in turn to stop. As a backup procedure, dCache will wait up to one second for all threads associated with that cell to terminate before explicitly interrupting any of that cell’s remaining threads. The dcache stop command, which triggers the domain shutdown, will wait up to ten seconds for the domain to finish shutting down. After this, it kills the java process outright. Therefore domain shutdown must finish before this deadline to achieve a clean shutdown.
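The backup procedure can be pictured with the sketch below. It is written in Python purely for illustration (dCache itself runs on the JVM), and the stop_cell function and its parameters are hypothetical. It asks the workers to stop, waits for them against one shared deadline, and returns the threads that are still running, which are the ones the backup procedure would then interrupt.

```python
import threading
import time


def stop_cell(threads, stop_event, grace=1.0):
    """Signal a cell's threads to stop and wait up to `grace` seconds in total.

    Returns the threads still alive after the deadline.
    """
    stop_event.set()
    deadline = time.monotonic() + grace
    for thread in threads:
        # Each join only gets whatever remains of the shared deadline.
        thread.join(max(0.0, deadline - time.monotonic()))
    return [thread for thread in threads if thread.is_alive()]
```

A service that stops its own background activity returns an empty list almost immediately, whereas one that relies on the backup procedure uses up the whole grace period.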

With dCache v2.11, the shutdown procedure of several services (billing, httpd, loginbroker, pool, poolmanager, replica) has been improved by explicitly stopping background activity. Previous versions of dCache relied on dCache’s backup procedure for terminating left-over activity. By explicitly stopping background activity, shutting down each of these services can now take less than one second.

Shutting down the pnfsmanager service has also improved. It is now faster, cleaner and safer. Previously, pnfsmanager also relied on the backup procedure to terminate on-going activity; this was unfortunate not only because it slowed down the shutdown but also because such activity could attempt to use the database after the database connection pool was stopped. This could result in errors being logged and risked partial processing of requests. With dCache v2.11, this has been fixed.

When shutting down services that use a thread-pool for processing messages (gplazma, pinmanager, pool, srm), the shutdown sequence now ensures all threads have stopped before closing resources. As with pnfsmanager, previous versions of dCache risked concurrent activity continuing after database connections had been stopped. This is now fixed.

Improved HTTP services

Jetty is the web-server framework used by the webdav, httpd and srm services. In HTTP 1.1, a network connection is kept open unless the client explicitly requests it be closed. To the user, subsequent requests appear to be processed faster as there is no overhead from establishing a new TCP connection and SSL/TLS/GSI context.

In previous versions of dCache, each HTTP connection occupied a thread, even when waiting for the client to send the next request. Such occupied threads cannot process requests from other clients. This made tuning dCache for optimum performance difficult: many clients connected but not issuing requests would result in high memory usage, but reducing this number could impair the server’s ability to handle concurrent requests. Safe configurations would likely lead to dCache underperforming under high load. With v2.11, dCache frees up the thread while waiting for the client’s next request, making it easier to obtain optimum performance.

A separate improvement is that dCache reacts better when faced with many clients attempting to connect at (nearly) the same time.

In HTTP 1.1, if the client decides to keep the connection open, the server may still close it later on. In dCache, the policy is to close any connection that remains idle for too long. Idle means no traffic flowing in either direction; either dCache is waiting for the next client request or the client is waiting for dCache’s response.

By default, the srm service will close connections after one minute of inactivity; the webdav service will close idle connections after five minutes of inactivity; and, for the httpd service, connections are closed after 30 seconds of inactivity. These idle timeouts may be adjusted through dCache configuration properties.

Jetty also has the concept of being low on resources. This is triggered when all request processing threads are active and there are more clients waiting to connect; for example, when many clients attempt to connect at (more or less) the same time.

When Jetty detects it is low on resources, it can react by reducing the timeout for idle connections. By more aggressively closing idle connections, the service can “degrade gracefully” when faced with a sudden inrush of client connections; the users experience only a slight delay in receiving replies as the necessary resources are quickly freed. The srm and webdav services reduce the idle-connection timeout to 20 seconds and 30 seconds respectively, whereas httpd does not react.

In previous versions of dCache, when low on resources, the more aggressive idle-disconnection policy would affect only new connections; the existing idle connections would enjoy longer timeouts. With dCache v2.11, all SRM or WebDAV connections experience the reduced idle-connection threshold.
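The default timeouts and the low-on-resources behaviour described above can be summarised in a small sketch. The numbers are the defaults quoted in this section; the function and table names are hypothetical and this is not how Jetty or dCache expresses the policy internally.

```python
# Default idle timeouts in seconds, as quoted in this section.
IDLE_TIMEOUT = {"srm": 60, "webdav": 300, "httpd": 30}

# Reduced timeouts applied when Jetty is low on resources; httpd does not react.
LOW_RESOURCE_IDLE_TIMEOUT = {"srm": 20, "webdav": 30}


def effective_idle_timeout(service: str, low_on_resources: bool) -> int:
    """Idle timeout that applies to every connection of the given service."""
    if low_on_resources:
        return LOW_RESOURCE_IDLE_TIMEOUT.get(service, IDLE_TIMEOUT[service])
    return IDLE_TIMEOUT[service]


def should_close(service: str, idle_seconds: float, low_on_resources: bool) -> bool:
    """True if a connection that has been idle this long would be closed."""
    return idle_seconds >= effective_idle_timeout(service, low_on_resources)
```

For example, a WebDAV connection that has been idle for 45 seconds is kept under normal conditions but closed once the service is low on resources, regardless of when the connection was established.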

Improved deletion speed

The number of database updates needed to delete a file has been reduced, making deletions faster. The effect is most pronounced if the file contained many non-empty levels. This affects the pnfsmanager and nfs services.

Libraries

Many of dCache’s third-party library dependencies have been upgraded with this release. Please refer to each library project’s documentation for full details on the resulting improvements:

  • AspectJ: 1.8.2
  • BerkeleyDB: 6.1.5
  • DataNucleus: core 4.0.1, plugin 4.0.0-release
  • GSON: 2.3
  • HikariCP: 2.0.1
  • Jetty: 9.2.2.v20140723
  • netty: 3.9.4.Final
  • PostgreSQL: 9.3-1102-jdbc41
  • Scala: 2.11.2
  • SLF4j: 1.7.7
  • Spring: 4.0.6
  • Xerces: 2.11.0

Changes to services

The following section describes changes that affect a specific service.

alarms

With this release, the alarms service has received substantial improvements. These are described below.

A domain may now host both the alarm service and other services. As a consequence, the alarm-specific logback-server.xml file (located in the /var/lib/dcache/alarms/ directory for RPM packages) has been removed. As log messages generated by the alarm service are handled in the same fashion as other services, the service.log file has also been removed.

With dCache v2.11, alarm definitions are now either predefined or custom. Predefined alarms are automatically generated within dCache without requiring a match against some regular expression; these are the alerts defined by the dCache developers. Custom alarms are those that must match some regular expression; these are defined by the admin and are comparable to the alarms defined in earlier dCache releases.

All predefined alarms are sent at the ERROR level, so it is not necessary to set dcache.log.level.remote to below this level for the alarm service to receive all predefined alarms. It may still be necessary to adjust this property if custom alarms match against messages logged at a lower threshold. While this is supported, the same performance caveats exist as with earlier releases.

It is no longer necessary to mirror the alarms-custom-definitions.xml file between the alarms and httpd services. However, when using XML storage, the requirement to provide some shared storage between the nodes hosting alarm and httpd services continues.

The structure of the custom alarm definition or filter has been simplified. It now consists only of the following elements:

  1. type name
  2. keyWords
  3. regex
  4. regexFlags
  5. matchException
  6. depth

Note that the predefined Severity level, along with logger and thread ids, have been removed as filter attributes.

When an alarm fires, it is assigned one of four priorities (these replace the severity concept in earlier versions). In descending order, these are critical, high, moderate and low. By default, all alarms have critical priority. This default priority may be altered in the admin interface and individual alarms may be assigned a lower priority through the priority map.

The priority map is stored in the alarms-priority.properties file, which is located in the /var/lib/dcache/alarms directory in RPM packages. The priority map may be adjusted either by editing this file directly (and asking dCache to reload it) or via the admin interface.
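The lookup that the priority map implies can be sketched as below. This models only the behaviour described above (four priorities, critical as the default, per-alarm overrides); the on-disk syntax of alarms-priority.properties is not shown here, the map is represented as a plain dictionary, and the alarm type names are invented for the example.

```python
# Priorities in descending order, as listed above.
PRIORITIES = ("critical", "high", "moderate", "low")


def priority_of(alarm_type, priority_map, default="critical"):
    """Priority of an alarm type: the mapped value, else the default."""
    priority = priority_map.get(alarm_type, default)
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority: {priority}")
    return priority


def sort_by_priority(alarm_types, priority_map):
    """Order alarm types from most to least urgent."""
    return sorted(
        alarm_types,
        key=lambda t: PRIORITIES.index(priority_of(t, priority_map)),
    )


# Hypothetical alarm type names, for illustration only.
example_map = {"EXAMPLE_SLOW_POOL": "low", "EXAMPLE_CHECKSUM": "high"}
print(sort_by_priority(
    ["EXAMPLE_SLOW_POOL", "EXAMPLE_UNMAPPED", "EXAMPLE_CHECKSUM"], example_map))
```

An alarm type that is absent from the map falls back to the default, which is why unmapped alarms sort first unless the default priority has been lowered.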

Note that, while there is still a Severity column in the database, it is no longer used.

The previous dCache shell commands to add, modify, remove and send alerts are still present. Their options have been modified slightly to reflect changes in this release. There is a new dcache alarm list command. This lists all predefined alarms.

The alarm service is now a well-known cell by default.

The alerts admin commands have been extended to allow creating custom alerts and assigning priority to alerts. The commands predefined ls and send have also been added to the admin interface; these behave as the similarly named dcache shell commands.

The optional alarm cleaner daemon now runs as part of the alarms service, not in the httpd service. The default values for its properties have been moved from httpd.properties to alarms.properties and renamed accordingly.

There are a number of new and modified alarms properties which are documented in the alarms.properties defaults file. Most significant of these are those for enabling email notification, which no longer requires direct editing of logback configuration files.

The alarms web page is basically the same, except that the “Severity” label has been replaced by “Priority”. That filter choice affects only alarms and not unmarked logging events. Since this is not a schema attribute, it is not reflected in the actual table, but alarms will be sorted there by priority first.

billing

The billing service no longer logs a warning on shutdown.

The billing log format has been extended to allow InfoMessage log entries to include the session identifier. This is particularly useful when understanding which SRM session is responsible for billing entries. The default format is not altered, so sites must adjust their configuration to include session information.

gplazma

Update gPlazma Login result printer to use a human-friendly label rather than a dot-separated-number for the SHA-512-with-RSA algorithm.

The kpwd plugin, which supports the proprietary kpwd-format files, has been updated to support multiple gids. Users with multiple gids have a comma-separated list of gids instead of the single gid value, with the primary gid the first gid in the list. A comma-separated list of gids is valid for both login and password lines.
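As a sketch of how the gid field is interpreted, the function below splits a comma-separated list and treats the first entry as the primary gid. It covers only that one field; the layout of the rest of a kpwd line is not reproduced here, and parse_gids is a hypothetical helper rather than part of the plugin.

```python
def parse_gids(field: str):
    """Split a gid field into (primary gid, all gids).

    A single value such as "500" is still accepted and yields a one-element list.
    """
    gids = [int(part) for part in field.split(",")]
    return gids[0], gids


print(parse_gids("1000,2000,3000"))
print(parse_gids("500"))
```

Because a single value parses as a one-element list, existing files with one gid per user keep working unchanged under this reading.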

info

The info-provider groups together pools that GLUE can report together for storage accounting. Such groups are called Normalized Access Storages (NASes). Previously, info tried to group pools as much as possible, which leads to confusion over the names. Now, a NAS represents all the pools accessible by the same set of links.
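The new grouping rule amounts to partitioning pools by the exact set of links that can reach them. A minimal sketch, assuming the pool-to-links relation is available as a dictionary (the pool and link names are invented):

```python
from collections import defaultdict


def group_pools_into_nas(pool_links):
    """Group pools so that each group shares exactly the same set of links.

    `pool_links` maps a pool name to an iterable of link names.
    Returns a dict from frozenset of links to a sorted list of pools.
    """
    groups = defaultdict(list)
    for pool, links in pool_links.items():
        groups[frozenset(links)].append(pool)
    return {links: sorted(pools) for links, pools in groups.items()}


example = {
    "pool_a": ["link_read", "link_write"],
    "pool_b": ["link_write", "link_read"],
    "pool_c": ["link_read"],
}
for links, pools in group_pools_into_nas(example).items():
    print(sorted(links), pools)
```

Here pool_a and pool_b end up in one NAS because they are reachable through the same two links, while pool_c forms a NAS of its own.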

nfs

The NFS door and mover now show the correct unit, nanoseconds (ns), when showing performance statistics for the different NFS operations.

The database query used to gather information for the df command on an NFS mount has been improved. However, this remains an expensive operation.

The NFS door and mover are updated to nfs4j v0.9.1. This brings the following improvements:

  • Support for the all_root export option. This treats all requests from the specified client machines as having come from the root user. Created files and directories have the same ownership as the parent directory.

  • We ensure that file attributes (e.g., file size) are always up-to-date. Previously, zero file size was reported for a short period after closing a file written through NFS.

  • Some performance boosts when checking filenames.

  • Make deletes through NFS v3 faster.

pool

Some pool error messages have been expanded slightly to give some contextual information and to distinguish between otherwise identical errors.

The migration module is idempotent, which means running the same command twice has the same effect as running it once, assuming no pool has gone offline between runs. While being idempotent is a useful feature, sometimes one might want multiple copies of a file, which simply running the migration module command multiple times does not achieve.

With dCache v2.11, the migration module commands now support the -replicas option. This allows the admin to specify the desired number of replicas. The pool will then create that number of replicas sequentially.
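The interplay between idempotency and the -replicas option can be modelled very simply. The sketch below assumes the option expresses a target number of copies on the destination pools and that a run only creates the copies that are missing; run_migration is a toy model, not the migration module's actual logic.

```python
def run_migration(existing_copies: int, replicas: int = 1) -> int:
    """Return the number of copies after one run of a (modelled) migration job.

    Only the missing copies are created, so repeating the run changes nothing.
    """
    missing = max(0, replicas - existing_copies)
    return existing_copies + missing


# Running the default job twice still leaves a single copy.
print(run_migration(run_migration(0)))
# Asking for three replicas yields three copies, however often the job is repeated.
print(run_migration(run_migration(0, replicas=3), replicas=3))
```

This is why repeating a command cannot be used to obtain extra copies, and why an explicit replica count is needed instead.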

Previously, the HSM interface processed all HSM requests sequentially: one at a time, using a single core. This is particularly bad for staging requests as preparing the pool for the staged file may require IO. With 2.11, the HSM now processes requests concurrently, which allows CPU and IO activity to overlap and spreads this load over multiple cores.
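The difference between the two processing models can be sketched with a thread pool. This is an illustration only: the stage function is a placeholder in which a short sleep stands in for the IO needed to prepare the pool, and the worker count is an arbitrary choice rather than a dCache default.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stage(request_id: str) -> str:
    """Placeholder for one staging request; the sleep stands in for IO."""
    time.sleep(0.05)
    return f"staged {request_id}"


def process_sequentially(request_ids):
    """Old behaviour: one request at a time."""
    return [stage(request_id) for request_id in request_ids]


def process_concurrently(request_ids, workers=4):
    """New behaviour: requests overlap, so waiting on IO no longer blocks the rest."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(stage, request_ids))
```

Both functions return the same results in the same order; the concurrent version simply finishes sooner when individual requests spend most of their time waiting.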

The output from the admin commands rh ls, st ls and rm ls is now sorted by creation time.

Previously, just before a pool started accepting a new file, it would request the expected checksum from pnfsmanager. In the majority of cases there is no checksum, so this slowed down dCache. In 2.11, this step is skipped; uploads are now (slightly) faster. A client may still supply the file’s checksum prior to upload (using FTP) and dCache will verify the checksum once the upload has completed.

pnfsmanager

As part of the general shutdown improvements, during shutdown the pnfsmanager service will reject all queued requests and return a “No Route To Cell” error. Requests that pnfsmanager has started processing are given 500 ms to complete before being forcefully terminated.

replica

Replica manager now correctly shuts down the database connection pool when dCache is shutting down the domain.

srm

With dCache v2.11, the srm service now includes a self-throttling behaviour when clients make more requests than the database can cope with. Previously, when overloaded, all database updates were dropped and a message like Persistence of request <request> skipped, queue is too long was logged. The result was that the database could hold incomplete or inaccurate information about SRM activity.

Such inaccuracy was acceptable as the database was used for debugging information. However, with the adoption of a separate upload directory and some support for recovering user activity after a restart, this is no longer true. Moreover, such debugging information is likely very useful when understanding any unexpected behaviour during an overload situation, so it needs to be accurate, particularly under such conditions.

SRM database updates naturally fall into two types: major and minor. Major updates record new requests and transitions to a final state (SUCCESS, FAILURE, CANCELLED); minor updates record changes to some non-final state. In an overload situation, minor updates may be dropped but major updates will always be processed. If necessary, processing the SRM request will be delayed so that major updates are stored. This has a secondary effect of creating a back-pressure on clients, slowing them down and allowing the srm service to recover.
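One way to picture the major/minor policy is a bounded queue in which minor updates are dropped when the queue is full, while major updates block the caller until there is room. This is a sketch of the policy only: the blocking put is a stand-in for whatever mechanism the srm service uses to delay request processing, and the function names are hypothetical.

```python
import queue

# Final states named in the text above.
FINAL_STATES = {"SUCCESS", "FAILURE", "CANCELLED"}


def is_major(is_new_request: bool, state: str) -> bool:
    """Major updates record new requests or transitions to a final state."""
    return is_new_request or state in FINAL_STATES


def submit_update(update_queue, update, major: bool) -> bool:
    """Queue a database update; return False if a minor update was dropped."""
    if major:
        # Blocks while the queue is full, which slows the caller down
        # and so creates back-pressure on clients.
        update_queue.put(update)
        return True
    try:
        update_queue.put_nowait(update)
        return True
    except queue.Full:
        return False
```

With a queue created as queue.Queue(maxsize=n), a burst of minor updates beyond n is silently skipped, whereas a major update waits for the database writer to drain the queue.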

The admin commands ls and ls completed have been refactored. There is now a single command ls that accepts the option -completed=<n> to limit the number of completed operations.

Closing SRM connections

With the adoption of Jetty v9, the GSI implementation has been rewritten to use Java’s built-in SSL support in a more direct fashion. The most immediate effect of this is that, by default, the shutdown sequence of SSL and GSI connections has changed. Before, the server would simply close its end of the network connection; now, the server issues a CLOSE alert that notifies the client of the impending close before closing its end of the connection.

Sending a CLOSE alert is part of the SSL v3 and subsequent TLS protocol specifications. Previous versions of dCache were in error by not sending such alerts.

Although most SSL/TLS clients expect the CLOSE alert, the dCache SRM clients (prior to v2.10.7) consider any alert an indicator that the operation has failed. Depending on the client configuration, the client might or might not retry the operation several times if this happens. In either case, the return code and output will reflect a failed operation, whereas the operation may have succeeded.

This has been fixed with SRM-client release v2.10.7: this version accepts the CLOSE alert but does not require it. It is compatible with dCache v2.11 and earlier versions.

To allow a smooth transition, dCache includes the srm.enable.legacy-close configuration property. When set to true (the default) the srm service will close connections without first issuing the CLOSE alert, in keeping with previous dCache behaviour.

Future versions of dCache will change the default and eventually remove this option. Therefore sites are invited to upgrade their clients.

transfermanagers

TransferManager now generates IDs based on the current time rather than by taking the next SRM request ID. This means the transfermanagers service no longer needs access to the SRM database.
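A time-based ID generator needs no shared database sequence, which is the point of the change above. The sketch below shows one possible scheme, assuming millisecond timestamps with a bump to keep IDs strictly increasing; the scheme dCache actually uses is not described here and the class name is hypothetical.

```python
import threading
import time


class TimeBasedIdGenerator:
    """Generate strictly increasing IDs from the current time (assumed scheme)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the behaviour can be tested
        self._last = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            candidate = int(self._clock() * 1000)  # milliseconds since the epoch
            # If two IDs are requested within the same millisecond,
            # step past the previous one instead of repeating it.
            self._last = max(candidate, self._last + 1)
            return self._last
```

Uniqueness here holds within one generator instance; the service no longer has to ask the SRM database for the next request ID.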

Scripts

The startup time for the different dcache billing commands has been reduced.

Changelog from 2.10 to 2.11.0

15d83f6
webdav: correct error message for incorrect Credential header value
e414a4c
srm-client: delegate by default for https 3rd-party transfers
65d7504
chimera: Add messages to JdbcFs#move exceptions
111e553
poolmanager: Warn when spacecostfactor is too big
bb8c9f5
srm-client: add srmfs to RPM manifest
b9b1ecd
poolmanager: Warn when spacecostfactor is too big
11e88a6
doors: Base64 session counter
84c8587
pnfsmanager: Check file type before overwrite
b459316
srm-client: Report srmPutDone failure
6885adc
chimera: Refine IOException to ChimeraFsException
f455fdc
check-config: Produce error when using scoped properties
d16ccfb
srm-client: Add interactive SRM shell
10d005e
poolmanager: Kill watchdog on shutdown
d934502
pool: Kill sweeper thread on shutdown
4a0c600
httpd: Kill transfer observer threads on shutdown
7626a41
loginbroker: Kill worker threads on shutdown
6919e26
pnfsmanager: Controlled shutdown of processing threads
d6754d5
loginmanager: Associate ScheduledExecutor with the cell’s thread group
423fe0d
cells: Orderly shutdown of multi-threaded cells
6939d2e
mark internal <service>.db.xyz properties as immutable
fa10ecd
nfs: mark unused nfs.db.host property as forbidden
9fd5d7a
pom: update to nfs4j-0.9.1
0990a8d
chimera: Avoid coupling to hikari and alarms in JdbcFs
fad06a4
replica: Shut down message processing thread on cell shutdown
219bc9c
replica: Proper shutdown of database and avoid database singleton
50421f4
billing: Improve shutdown for db backend
c6073a5
Minor adjustment to NoRouteToCell error in LoginBrokerHandler
e9595f5
poolmanager: Shut down request container thread pool during shutdown
04fcb12
chimera: fix unit tests
b721145
pool: Fix pool size health check in case of asynchronous release of space
31dfdba
spacemanager: Suppress transient deadlock resolution errors
dbd6b70
pool: Fix race leading to false positices in pool size health checks
c91e195
pool: Fix ISE in CacheEntryImpl#toString
f743599
pool: Fix ISE in ‘rep ls’
904336c
webadmin: make pool queue data enum Java 8 compatible
b315fc0
webadmin: fix exit login in billing refresh loop
336ce07
pool: Make migration task cancellation more robust
272bfe8
cell: Log exceptions within the correct cell context
55de0de
pool: Fix pool selection bugs in migration module
201685d
admin: Propopage NPE to user
86f6f6b
chimera: query for inode generation on readdir
c84b19c
pool: ensure that mover error message always set
55eb29f
nfs4: shutdown state handler on destroy
bc0e891
billing: Fix date handling for json and yaml output
ef3d069
billing: Add file size to request records
ea8b501
billing: Improve error message on failur to parse date arguments
6c409d3
billing: Reduce startup time of dcache billing command
bdd5989
billing: Add billing format for PoolHitInfoMessage
d050876
webadmin: minor improvements to rrd4j-based pool-queue plots
2976ef1
pnfsmanager,doors: Add user relative upload directories
a85ef3c
billing: Expose session identifier
1da82a4
rpm: include logrotate config file
23cfdc5
pool: Fix interpretation of eager option of migration module
d88a98b
srm-client: Output file level errors for sync requests
f7fc29d
pool: Respect LAN port range for internal srmcp transfers
05e5f62
srm: Fix listing of failed, done and cancelled jobs
8378b5e
cell: Prevent interpreter stack overflow from killing the domain
a85817f
cell: Declassify eval failure as a bug
eefea09
system-test: add ‘start’ profile
1af1d75
spacemanager: fix forbidden property message
1321d431
ftp: Fix root path validation for upload directory
6651bb2
xrootd: Upgrade to xrootd4j 1.3.5
96e8a94
pool: Fix minor errors in migration module documentation
4dfca00
pool: Add JavaDoc to migration JobDefinition
47b65f6
pool: Allow multiple replicas for migration jobs
32505e9
pool: Decouple migration task from migration job
be91838
pool: Prevent interruption of replica deletion during HSM flush
8bb52a8
pool: Fix HSM cancellation
cde169e
dcache: Let login cache propagate IllegalArgumentException
43ecb8c
dcache: Do not cache gPlazma timeouts
e837405
test: Upgrade to mockito 1.10.4
b456232
dcache: Use Mockito for CacheLoginStrategy unit tests
32c877f
dcache,nfs,dcap,srm: Drop duplicate LoginStrategy cache
788b56c
cells: fix IAE in ‘show pinboard’
b234566
commons: remove period based gauges
204a953
commons: update RequestExecutionTimeGauges to accept units
26bb60f
commons: add TimeUtils#unitStringOf()
ae63e2e
commons: fix TimeUtils nanos to duration string conversion
1306124
hsm: Increase concurrency in nearline storage interface
d5a20b0
srm: Fall back to synchronous saving in case of overload
2558d56
info: Generate NAS by links rather than access
92b3668
pool: Fix xrootd vector read limits
8287987
chimera: fix NPE on ‘.(parent)()’ for root inode
25335b0
alarms version 2: move database cleanup from webadmin to service
2f92190
chimera: squash usedFiles() and usedSize() into a single query
22230c4
dcap: fix interaction with Spacemanager
d045b3c
srm-client: Fix logback configuration
98f2115
srm: Fix asynchroneous job storage leak
3fdcf9f
unittests: increase timeouts and fix race in DiskSpaceAllocatorTest
57df557
srm: Minor refactoring of ls command
1ab18b9
srm: Log common misconfiguration preventing upload
469b75b
srm: Fix return code and logging on “login errors”
3a11e20
alarms (common, dcache, dcache-webadmin, skel): version 2.
7175bf0
common: Fix race condition in AtomicCounter unit test
0dfc80e
srm: Fix deadlock like bug
7680a01
billing plots: use ‘transferred’ rather than ‘size’ for bytes
9f26723
configuration: adjust references to forbidden properties
14d493e
httpd: Use service specific property for list of login brokers
73c2601
xrootd: allow empty xrootd.authz.read-paths and xrootd.authz.write-paths
fed4abd
scripts: fix java invocation
d325d16
chimera: protect list initialization from FS inconsistencies
3273e4a
srm: Fix reporting of infinite space lifetime
70d414e
chimera: Fix creation time flag in ls command
3c6b4fa
Fix CacheException references in error messages
8ea811d
Mark deprecated properties as forbidden
a792239
Update to Jetty 9
650413c
dcache-common: change NetworkUtil getCanonicalHostname to prefer DNS name
626463c
srm: Make dcache-core independent of srm-server
c23ab8c
dcache-srm: Move AuthorizationRecord and related classes to dcache-srm
bdf4048
chimera: do not explicitly remove enties in t_level_x
c179f3d
srm: Align SRM info output with recent changes to scheduler configuration
f139c2e
webdav: remove unused import
38f4c2c
gplazma2: detect sha512 with RSA in LoginResultPrinter
72ac7f1
nfs4: add NFS file handle into NfsTransfer class
437191c
nfs4: invalidate vfs cache on successful write
7c8c18b
Revert “libs: update sshd to 0.11.0”
18c188c
Revert “libs: switch to jglobus-2.1.0-SNAPSHOT and bcpkix-1.50”
9d056fc
libs: update sshd to 0.11.0
e061819
libs: use nfs 0.9.0
bd94107
libs: switch to jglobus-2.1.0-SNAPSHOT and bcpkix-1.50
60d5765
Update netty and berkeley DB to latest version
d5b5f78
srm: Fix incomplete restore of “ready queued” jobs
00d2534
cells: Improve pinboard
a86a976
Avoid thread-unsafe use of SimpleDateFormatter
f5ae71a
dcache-spacemanager: Move space manager to its own Maven module
83a4fea
Update billing indexer to avoid deprecated Guava API
db28dc0
srm: Remove dead null check
1c14947
srm: Fix credential store registration
b3e9b74
srm: Fix restore of bring-online requests on restart
3c9e3fb
srm: Fix set max ready get command
5121943
pool,ftp: do not query for checksum on write
7d5cb72
libs: update to jglobus-2.0.6-rc7.d
95a193a
libs: update to nfs4j-0.8.1
e1cc820
chimera: Move new CLI to chimera and delete old CLI
507c3fc
build: Update to findbugs 3
9f8822e
dcache-info: Move info service to its own Maven module
f9957a9
dcache-srm: Move SRM specific dCache code to new module
1fb9ca4
chimera: mark as-run if i_dirs_ipnfsid exists
15d2a0a
solaris: fix solaris package script and add work-around for pkgmk bug
f2e9ded
acl: add heuristic to detect acl format
17e6de1
gplazma: make plugin creation caching optional
1db22cf
chimera: create missing index i_dirs_iparent
467ad69
pool: Resolve high memory usage and other issues in sweeper
53cfdd4
Update third party libraries
14797dd
info: Fix HashMap ordering assumption in unit test
d06ed79
Make JDK byte code verification bug workaround Java 8 compatible
c0a68c5
gplazma: Fix JVM implementation dependency in unit test
1adf0d0
poolmanager: Suppress stack trace in case of unmatch units
3425431
xrootd: Add billing entry on delete
f89706c
bugfix: fix reverseMap not throwing NoSuchPrincipalExceptions
f21d639
kpwd: add support for multiple GIDs
76f3812
Clean up Spring instantiation of database connection pools
61d7402
cells: Fix deadlock during startup
4b10fe5
Fix several issues in recent alarms changes
e547954
srm: generate SRM v2.2 Axis-generated code as part of build process
e013bb7
build: add work-around for JDK bug and PowerMock
b89cd06
pool: Fix concurrency settings for script nearline storage
38e3ac3
poolmanager: Fix ‘request clumping limit reached’ failures
abdb8d4
poolmanager: Avoid request leak
f4584c7
pool: Order nearline requests
8d1ce53
srm: Fix race condition in srmPrepareToGet
deb5996
webdav: Fix certificate handling for HTTP third-party copy
f9f9ccd
poolmanager: Fix excessive p2p and stage alive checks
c99aa7b
srm: Restore estimated wait time update
d2129e3
srm: srm,dcap: Use configured vomsdir and capath
06cf581
webdav: Kill mover in case transfer times out
797b966
webdav: Make threading and TCP connection limits configurable
c272c48
pool: Fix several minor issues in nearline storage subsystem
7f39db9
srm,dcap: Use configured vomsdir and capath
d207324
Upgrade to DataNucleus 4
ce76104
Don’t use legacy properties
658ff68
alarms: eliminate httpd.alarms.db.xml.path – missing deletion of batch check
4a66f6c
alarms: eliminate httpd.alarms.db.xml.path
325948d
alarms: convert definitions into marked alarms
759484d
poolmanager: remotve unused variable
c3acc43
src: move VspArgs into dcache-dcap module
ff5e042
alarms: decorate data source to capture connector errors as alarms
1ba6f6c
network utils: add constant for canonical host name
d386a4d
alarms: add fuller support for Markers
805b27c
srm: fix deserialise of TExtraInfo
852ea99
srm: fix NPE when checking delegated credential
00d2f1a
srm: fix NPE when checking delegated credential
5e46143
srm: fix deserialise of TExtraInfo
ed288b0
srm: fix IllegalStateTransition for COPY requests
dc34d0f
releases: update dCache version to v2.11