What’s new in dCache 4.2
Release notes
Highlights
This release ships with dCache-View 1.4+, which provides a new version of the familiar admin pages. Currently, access to these new pages requires the user to be authorized for, and to assert, the admin role. The pages are only fully populated with their respective data when, in addition to the frontend service, the alarms and history services are running, and the billing service has the database backend enabled.
With this release, dCache-View completely replaces the (Wicket-based) “webadmin” interface, which is no longer available on ports 2288 and 8440. (The old static pages at http://<host>:2288/old are, on the other hand, still available, but these, too, will eventually be removed in a future release.)
Incompatibilities
Some changes (a11ec8d76b, 887d8d5738 and 2ff49b600c) in the current release require that Pool Manager, Space Manager and Door instances be updated to 4.2 at the same time. For users who wish to do incremental updates, pools may remain at 4.1 during that update.
Acknowledgments
The dCache team thanks Christoph Anton Mitterer, Martin Johnki, Onno Zweers and Johannes Thurn for their pull requests and contributions.
Release 4.2.51
srm
The host IP address is now used for comparison when determining whether a SURL is local.
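As a rough illustration of the idea (not dCache's actual implementation), the sketch below decides whether a SURL is local by comparing the IP addresses its host resolves to with the local addresses, rather than comparing host names as strings. The function names, the injectable resolver and the example host names and addresses are all hypothetical.

```python
import socket
from urllib.parse import urlparse


def resolve_ips(host):
    """Return the set of IP addresses that a host name resolves to."""
    infos = socket.getaddrinfo(host, None)
    return {info[4][0] for info in infos}


def is_local_surl(surl, local_ips, resolver=resolve_ips):
    """Hypothetical check: a SURL is local if its host resolves to a local IP."""
    host = urlparse(surl).hostname
    if host is None:
        return False
    try:
        return bool(resolver(host) & set(local_ips))
    except OSError:
        # An unresolvable host cannot be shown to be local.
        return False


# Illustrative resolver with made-up names and documentation addresses.
fake_dns = {"srm.example.org": {"192.0.2.10"}, "alias.example.org": {"192.0.2.10"}}
lookup = lambda host: fake_dns.get(host, set())
print(is_local_surl("srm://alias.example.org:8443/data/file", {"192.0.2.10"}, lookup))
```

With a name-based comparison, an alias such as the one above would not match the door's own host name; comparing resolved addresses treats both names as the same endpoint.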
Changelog 4.2.50..4.2.51
- 5692bc2
- [maven-release-plugin] prepare release 4.2.51
- 1c84e8c
- srm: use host IP for comparison when determining if SURL is local
- 484c677
- [maven-release-plugin] prepare for next development iteration
Release 4.2.50
cell
The Curator client was not able to restore the connection to the ZooKeeper server after network partitioning. This is now fixed.
skel
This release fixes the tape-reserved size calculation.
webdav
This release fixes an issue where the WebDAV door failed to follow RFC 4918 by not including the DAV header in responses to OPTIONS requests. This made some clients reject the dCache WebDAV door as a valid WebDAV endpoint.
Changelog 4.2.49..4.2.50
- 248a753
- [maven-release-plugin] prepare release 4.2.50
- f724670
- Fix tape-reserved size calculation
- 7fd099a
- webdav: include DAV header in OPTIONS requests.
- 148fcf2
- cells: do not re-define zookeeper watcher
- 05e3758
- [maven-release-plugin] prepare for next development iteration
Release 4.2.49
webdav
This release fixes an issue with transfers through dCacheView when the webdav door is configured with an empty webdav.allowed.client.origins value, which is the default value.
Changelog 4.2.48..4.2.49
- aa442c5
- [maven-release-plugin] prepare release 4.2.49
- 9dc5382
- dcache: add null check to pool info collector util
- 22f99cd
- webdav: fix CORS when all clients are allowed to connect
- 4973b09
- [maven-release-plugin] prepare for next development iteration
Release 4.2.48
srm
The current release fixed a problem resulting in high CPU use in SrmManager if clients are attempting to pin a file and PinManager is unavailable.
A regression was fixed where SrmManager would reject all QUEUED jobs and INPROGRESS BringOnline requests on restart if no SRM doors were running when SrmManager started.
Changelog 4.2.47..4.2.48
- 7af11ca
- [maven-release-plugin] prepare release 4.2.48
- 9ec9e32
- SrmManager: fix handling of saved requests on start-up
- e6b17d6
- SrmManager: avoid spamming if PinManager is down
- bce8d98
- [maven-release-plugin] prepare for next development iteration
Release 4.2.47
doors
This release fixes a bug where running the lb set tags admin command without any arguments triggered a NullPointerException.
pool
This release improves the messages reported when a migration job is cancelled.
scripts
The dcache-storage-descriptor command no longer requires a URL argument.
Changelog 4.2.46..4.2.47
- 7540712
- [maven-release-plugin] prepare release 4.2.47
- 5ce4df2
- doors: fix “lb set tags” command with no arguments
- fe169d7
- pool: improve messages when migration job is cancelled.
- 61b8d64
- scripts: fix variable ordering in dcache-storage-descriptor
- 6f2213e
- [maven-release-plugin] prepare for next development iteration
Release 4.2.46
pool
Pool health-check log messages now include the pool’s name.
webdav
After an unsuccessful HTTP-TPC pull request, dCache deletes the incomplete file. Previously, if this deletion did not work, a message was logged at WARN level, even when the incomplete file had already been deleted by some other means. Such failures are now logged at DEBUG level rather than WARN level.
xrootd
This release refits checksum handling following a bug fix in xrootd4j.
Changelog 4.2.45..4.2.46
- 9cc8d6d
- [maven-release-plugin] prepare release 4.2.46
- a92589d
- dcache-xrootd: refit checksum handling after xrootd4j bug fix
- 5349dd5
- webdav: avoid logging non-error as an error
- c1cb9fd
- pool: include pool name in health-check reports
- 3488666
- [maven-release-plugin] prepare for next development iteration
Release 4.2.45
frontend
The current release fixed QoS pin semantics.
A bug was fixed in frontend that resulted in a NullPointerException for billing queries where no limit is specified.
Changelog 4.2.44..4.2.45
- 5fe7649
- [maven-release-plugin] prepare release 4.2.45
- 95694d9
- frontend: fix NPE if limit is not specified
- 34ca0ec
- dcache-frontend: fix QoS pin semantics
- fca79f8
- [maven-release-plugin] prepare for next development iteration
Release 4.2.44
Changes affecting multiple services
The Apache Commons Compress library used in dCache was updated to version 1.19.
A rare deadlock situation in the Chimera database was eliminated. In cases where, within the same directory, concurrent mkdir and rmdir events happened, transactions within the database could deadlock. This would be indicated by the message “ERROR: deadlock detected” in the logs.
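The changelog describes this as an ABBA deadlock: two operations take the same pair of locks in opposite order. The following generic sketch (plain Python threads, not Chimera's SQL or dCache code; all names are hypothetical) shows the usual remedy of acquiring locks in one fixed order regardless of the order in which an operation logically touches them.

```python
import threading

# Two hypothetical locks standing in for a parent directory row and a child row.
locks = {"parent": threading.Lock(), "child": threading.Lock()}


def with_ordered_locks(names, action):
    """Acquire the named locks in sorted order, run action, release in reverse."""
    ordered = sorted(names)
    for name in ordered:
        locks[name].acquire()
    try:
        return action()
    finally:
        for name in reversed(ordered):
            locks[name].release()


results = []


def mkdir_like():
    # Logically touches the parent first, then the child.
    with_ordered_locks(["parent", "child"], lambda: results.append("mkdir"))


def rmdir_like():
    # Logically touches the child first; the helper still locks in sorted order.
    with_ordered_locks(["child", "parent"], lambda: results.append("rmdir"))


threads = [threading.Thread(target=f) for f in (mkdir_like, rmdir_like) * 50]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
print(len(results))
```

Because both operations always take "child" before "parent", no thread can hold one lock while waiting for a thread that holds the other, so all 100 operations complete.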
pool
There were reports of extraordinarily high CPU usage on pool nodes with a large number of cached files. Through an optimization of the sweeper, CPU usage was reduced significantly.
xrootd
This release fixes a vulnerability in dCache’s XRootD protocol implementation. We recommend that all sites update their XRootD doors. Details will be made available through EGI Security and, in a week’s time, through an update to these release notes.
Changelog 4.2.43..4.2.44
- 7bfe8eefac
- [maven-release-plugin] prepare release 4.2.44
- fb6913af6d
- dcache-xrootd: honor read paths when listing directories
- 246a96209f
- resilience: don’t compare Integer objects by refference
- 4fd1250396
- sweeper: use in-memory map instead of repository for histogram data
- c59972f06b
- dcache-xrootd: replace constants for version number
- cc59b68f91
- dcache-xrootd: update protocol version numbers
- 73f75af9b4
- libs: update apache.commons:commons-compress to 1.19
- ab0e78e949
- chimera: fix ABBA db deadlock when mkdir and rmdir run concurrently
- 5b12d34f43
- [maven-release-plugin] prepare for next development iteration
Release 4.2.43
dcap
The dcap door could not handle OUT-OF-DATE errors. This is now fixed: the door restarts pool selection when such an error occurs.
gplazma
This release fixes a thread leak in the LDAP plugin by explicitly closing the NamingEnumeration.
httpd
This release fixes the escaping of the status field in HttpPoolMgrEngineV3.
Changelog 4.2.42..4.2.43
- 16fbcfe
- [maven-release-plugin] prepare release 4.2.43
- 2807958
- dcap: restart pool selection on OUT-OF-DATE error
- ea7a0cf
- gplazma-ldap: avoid thread leak by explicitly close NamingEnumeration
- e8c88f7
- httpd: escape status field in HttpPoolMgrEngineV3
- 291213c
- [maven-release-plugin] prepare for next development iteration
Release 4.2.42
srm
A new user community requires the srm tools to be able to handle very large file listings. During preliminary tests, OutOfMemory errors from the srmls tool were observed. This is now fixed and srm can now support operations on very large file lists without running out of memory.
webdav
This release adds the Allow header to the list of response headers for OPTIONS requests.
Changelog 4.2.41..4.2.42
- f610c7f
- [maven-release-plugin] prepare release 4.2.42
- 1f7bac2
- webdav: add allow header to OPTION method request
- f2bd61c
- srm: Remove JVM memory limits
- 36fc58d
- [maven-release-plugin] prepare for next development iteration
Release 4.2.41
common
This release fixes the formatting of an error message in Checksum.
frontend
Admins may now configure frontend to specify in which country (or countries) data may be stored. This information is visible through dCacheView.
Changelog 4.2.40..4.2.41
- e9ee03b
- [maven-release-plugin] prepare release 4.2.41
- 4a4bc96
- frontend: make geographic placement configurable
- 0eb4352
- common: fix formatting of error message in Checksum
- a0af8d7
- [maven-release-plugin] prepare for next development iteration
Release 4.2.40
nfs
A NullPointerException triggered by the “show transfers” command is now fixed.
webdav
The current release fixed CORS for WebDAV doors that do not allow anonymous access; in particular, to support dCacheView uploading and downloading files with such authentication-required WebDAV doors.
Changelog 4.2.39..4.2.40
- 10130fe
- [maven-release-plugin] prepare release 4.2.40
- 9f23dc0
- nfs: fix NPE on “show transfers” command
- 65db4f8
- webdav: fix cross origin resources sharing issue
- 11ec1a7b
- [maven-release-plugin] prepare for next development iteration
Release 4.2.39
chimera
The shell infrastructure supports commands being given interactively, on the command line (e.g., ‘chimera mkdir /path/to/dir’) and from stdin (e.g., ‘echo “mkdir /path/to/dir” | chimera’). chimera now supports the latter case and properly shows command output when invoked in that fashion.
frontend
This release updates dCache View to 1.5.5.
webdav
A client may issue a PUT request that targets an existing collection resource; i.e., attempt to write a file as a path that is a directory. dCache, until now, responded with an incorrect status code of 500. This release changes the status code for this operation to 405 (Method not allowed), thus keeping closer to RFC 4918.
xrootd
This release improves compatibility with the xrdcp client in versions >4.9 by responding correctly to query strings requesting a specific checksum type.
Changelog 4.2.38..4.2.39
- 2e397312d8
- [maven-release-plugin] prepare release 4.2.39
- 75f6f709c9
- dcache, frontend: release dcache-view version 1.5.5
- e9e8116ca9
- chimera: chimera shell should show output when commands come from stdin.
- 71d073c448
- webdav: return 405 status code for PUT requests targeting collections
- 486ed07a35
- dcache-xrootd: add checksum cgi handling to door query
- d9e11ce6f7
- [maven-release-plugin] prepare for next development iteration
Release 4.2.38
many
The dcache pool ls command now provides correct output even if the pool is defined with a single-digit number of bytes.
Changelog 4.2.37..4.2.38
- 370c33dffe
- [maven-release-plugin] prepare release 4.2.38
- 88e0c7a3d8
- scripts: avoid copy-n-paste error when calculating pool size
- df2b7a31ce
- [maven-release-plugin] prepare for next development iteration
Release 4.2.37
frontend
This release updates the dCache View web GUI to version 1.5.4.
ftp
HAProxy can probe endpoints to discover if they are still alive.
The FTP door has an optimisation that detects such probes and does not create the FTP command interpreter, since the FTP client (the HAProxy instance) is calling on behalf of itself and will not issue any FTP requests.
This release fixes a regression that would cause erroneous NullPointerExceptions when FTP doors were probed by HAProxy.
pool
The default value for the xrootd third-party copy server response timeout, pool.mover.xrootd.tpc-server-response-timeout, was increased from 2 to 30 seconds to provide more robust behaviour in the face of high loads and network congestion.
transfermanager
Error messages such as the WebDAV door’s
Failed to fetch information for progress marker: failed to query pool: (0) Job not found : Job-1
which are reported when the TransferManager is unable to discover the current status of the pool mover, now include the pool’s name. This should make debugging easier.
Changelog 4.2.36..4.2.37
- 0db06e6a83
- [maven-release-plugin] prepare release 4.2.37
- 2a7152f117
- dcache, frontend: release dcache-view version 1.5.4
- 953613f359
- pools: make the xrootd tpc response timeout less aggressive
- 4ddcc839f9
- transfermanager: include pool name in error for ‘mover ls’ failures
- 0f51166f6e
- ftp: avoid NPE on HA-Proxy probes
- 3926c2e26f
- [maven-release-plugin] prepare for next development iteration
- de6d4fa3c9
- core: fix pool selection in killAll command of TransferManager
- 26cb9c29e2
- libs: update aspectj to java11 friendly version 1.9.2
- f5269d51cf
- zookeeper: remove ZooKeeperConnectionExceptionAspect
Release 4.2.36
Changes affecting multiple services
This release includes an updated Jetty library, with the update addressing CVE-2019-10247.
dcap
The Kafka messaging implementation in the dcap service has been made more robust, fixing issue [#4831](https://github.com/dCache/dcache/issues/4831).
frontend
Periodic activity associated with the frontend door is now logged with the door’s cell name. Such messages will also appear in the door’s pinboard.
nfs
Periodic activity associated with the NFS door is now logged with the door’s cell name. Such messages will also appear in the door’s pinboard.
pool
Attempting to start a full checksum scan (with csm check *) while an existing scan is still running is no longer reported as a bug.
Pool start-up logging now includes the corresponding pool cell name.
An internal timing check was updated, which should result in more robust pool behaviour. There should be no user-visible impact.
webdav
Periodic activity associated with the WebDAV door is now logged with the door’s cell name. Such messages will also appear in the door’s pinboard.
xrootd
A new configuration property, pool.mover.xrootd.tpc-server-response-timeout, allows setting a timeout for xrootd third-party copy operations. This can also be changed through the new admin command xrootd set server response timeout.
Changelog 4.2.35..4.2.36
- 81b97a7e0b
- [maven-release-plugin] prepare release 4.2.36
- 04c8ada79e
- pool: avoid IllegalStateException in ‘csm check *’ command
- dfe3670ee9
- dcap: fix premature close of kafka sender
- e234b45be3
- sweeper: compute now after the values have been fetched
- 236baac111
- libs: use jetty 9.4.18.v20190429
- aca121207b
- dcache-xrootd: add ability to override default timeout for server response (TPC)
- 03998dbb3e
- [maven-release-plugin] prepare for next development iteration
- 91b1a9bf63
- frontend: include CDC in scheduled activity
- c9c77f3a3e
- nfs: include CDC in scheduled activity
- 62ccc626f0
- webdav: include CDC in scheduled activity
- e0f853b011
- pool: ensure initialisation thread has correct CDC information
- df3fae5f4a
- jetty: make CanlContextFactory subclass of jetty.ssl.SslContextFactory.Server
- 7b40be97a6
- pom: use jetty 9.4.17.v20190418
Release 4.2.34
alarms
To ease troubleshooting, the POOL_DEAD alarm message now includes the pool name.
pinmanager
A bug was fixed where PinManager’s bulk ls admin command yielded a NullPointerException if the optional argument was omitted.
A typo prevented the error message “Remote connection failure while unpinning…” from appearing completely and correctly in the logs. The error message string now contains the message string of the underlying Exception, hopefully providing helpful details for troubleshooting.
pool
A regression that prevented a replica’s last access time from being updated was fixed.
A regression that prevented a replica’s position in the LRU queue for garbage collection from being updated was fixed.
webdav
Users asserting the “admin” role would occasionally receive NullPointerExceptions when trying to transfer files through WebDAV. This release fixes that issue.
Changelog 4.2.33..4.2.34
- 8db1a503f6
- [maven-release-plugin] prepare release 4.2.34
- 8cfe293334
- UnpinProcessor: fix assumed typo {)
- 5541899529
- webdav: allow transfers as user with role ‘admin’
- aa9f997c88
- pinmanager: avoid NPE if no argument given for ‘bulk ls’ command
- 5f9d5e9f49
- alarms: add pool name to POOL_DEAD alarm
- a1131deafd
- pool: fix reordering of removable replicas on access
- 82e929b623
- pool: fix storage of replica last access time
- 4ae111d210
- [maven-release-plugin] prepare for next development iteration
Release 4.2.33
Changes affecting multiple services
Stage requests from unknown locations resulted in a NullPointerException in the dcap and pinmanager services. This is now fixed, and using dccp to stage a file should work even if the location is unknown.
frontend
A client that disconnects and quickly reconnects could trigger a NullPointerException. This is now fixed.
resilience
This release fixes a race condition on replica state, so that inaccessible-file errors no longer occur for a newly written file.
Changelog 4.2.32..4.2.33
- bd37754
- [maven-release-plugin] prepare release 4.2.33
- d9be5d9
- dcap/pinmanager: stage request for unknown location results in NPE
- 35b65fe
- dcap batch : fix handling of dcap.kafka.topic variable
- 31c8a00
- dcache-resilience (stable branches): fix race condition on replica state
- 20a82c0
- frontend: fix race on client reconnecting
- ffd1254
- [maven-release-plugin] prepare for next development iteration
- 200e2f1
- resilience: adjust synchronization of file operation removal from map
Release 4.2.32
frontend
The destroy sequence for a channel failed to obtain the Channel monitor. This could result in concurrent changes to the Channel’s state and incomplete Channel shutdown. This is now fixed and the shutdown sequence of a channel is more robust.
pool
An unhelpful error message “Parameter directory is not a directory” is replaced with one that provides information on which directory is missing.
The pool no longer logs configuration or deployment problems that prevent the pool from creating a mover as if that problem was a bug.
Certain error cases in which a pool is unable to create a mover are no longer logged as a bug in dCache.
transfermanager
By default, the transfermanager will retry starting the mover ten times before giving up on the pool. If we know that a pool doesn’t support this transfer type (for whatever reason), then this makes no sense. This is now fixed and a pool that does not support a particular transfer is not immediately retried.
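The retry behaviour described above can be pictured with a small sketch. This is only an illustration under assumptions, not the TransferManager code: the exception classes and the start_mover helper are hypothetical names, and the ten-attempt default simply mirrors the figure quoted in the text.

```python
class TransferNotSupported(Exception):
    """Hypothetical error: the pool cannot perform this kind of transfer."""


class TransientPoolError(Exception):
    """Hypothetical error: a failure that may succeed on a later attempt."""


def start_mover(start, max_attempts=10):
    """Call start() up to max_attempts times; stop at once if unsupported."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            return start(), attempts
        except TransferNotSupported:
            # Retrying the same pool cannot help, so give up on it immediately.
            raise
        except TransientPoolError:
            continue
    raise TransientPoolError(f"gave up after {attempts} attempts")


calls = {"n": 0}


def flaky():
    # Fails twice with a transient error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientPoolError("busy")
    return "started"


print(start_mover(flaky))
```

Separating the two error types is the design point: transient problems consume the retry budget, while an unsupported transfer ends the attempt on that pool after a single try.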
This release fixes a NullPointerException that occurred if a transfer was cancelled.
Changelog 4.2.31..4.2.32
- d0b7b69
- [maven-release-plugin] prepare release 4.2.32
- bc59ba9
- pool: avoid throwing a RuntimeException for non-bugs
- 62be882
- transfermanager: do not retry starting mover if transfer is not supported
- 1666d8b
- pool: avoid log-and-throw anti-pattern
- 178aeb2
- frontend: ensure client is disconnected when shutting down channel
- 1964c2e
- frontend: avoid race on cancelling channel garbage-collect task
- cf9390b
- transfermanager: avoid NPE on shutdown
- ac4019f
- pool: throw exception with meaningful error message
- 4ef86a4
- [maven-release-plugin] prepare for next development iteration
Release 4.2.31
pool
This release improves the time formatting in the JSON mover info.
Changelog 4.2.30..4.2.31
- 6cdf4bf
- [maven-release-plugin] prepare release 4.2.31
- fc4abea
- pools: JSON mover info timeInSeconds should be timeInMilliseconds
- b74c390
- [maven-release-plugin] prepare for next development iteration
Release 4.2.30
dcache
The Jetty version was updated to 9.4.12.v20180830.
nfs
nfs4j was updated to version 0.17.11, which fixes the export table evaluation order.
resilience
It is now possible to record resilience activity (on the receiving end), which may prove useful in understanding behaviour.
In rare circumstances, dark removes could result in data loss by removing all replicas of a given file. This release fixes the issue.
Pool operations can now be restarted successfully from the command line after they have been shut down, without restarting resilience.
Changelog 4.2.29..4.2.30
- 4e3a9f7
- [maven-release-plugin] prepare release 4.2.30
- 0ba3158
- resilience: update state on pool operations when restarted from admin command
- f17163f
- libs: use nfs4j–0.17.11
- 060e570
- chimera-shell: fix class cast of extractor in constructor
- 2b208b9
- libs: update jetty version to 9.4.12.v20180830
- 87a2fbe
- nfs: fix missing CDC initialization
- 584cede
- resilience: do simple existence check of replica on pool to avoid dark removes
- cf1a0d9
- systemtest: remove ancient replica pools
- ee2ec0e
- [maven-release-plugin] prepare for next development iteration
- 98e3198
- resilience: add ability to log resilience activity
Release 4.2.29
dcache
The rados4j version was updated; the new version contains a bug fix addressing data corruption on write with HTTP.
kafka
This release adds the ability to specify the Kafka receiving topic name, for example dcache.kafka.topic=billing.
pool
This release improves performance for Ceph pools.
resilience
It is now possible to record resilience activity (on the receiving end).
webdav
This release fixes the resource name used for the door root.
Changelog 4.2.28..4.2.29
- 8a0c017
- [maven-release-plugin] prepare release 4.2.29
- aafa409
- pool: grow file prior HTTP TPC
- 98bb631
- pom: use rados4j–0.0.4
- d49df53
- resilience: add ability to log resilience activity (incoming)
- e8ac90b
- kafka: add ability to specify Kafka topic name
- 8ba03f6
- webdav: fix resource name for door root
- 0d7bb40
- [maven-release-plugin] prepare for next development iteration
Release 4.2.28
ftp
Clients can now request the checksum value of a file they do not own, even where dCache does not already know the checksum value.
pool
The current release fixed some logging on the pool where messages were recorded against an arbitrary context (i.e., the bit in square brackets), resulting in misleading information.
transfermanagers
TransferManager failed a transfer if the pool reported any problem when asked to start the transfer. This is now fixed, and third-party transfers are more robust against non-fatal errors that occur normally on a busy system.
Changelog 4.2.27..4.2.28
- 9ddb29e
- [maven-release-plugin] prepare release 4.2.28
- 367c012
- transfermanagers: recover from non-fatal error starting mover
- 7353b40
- pool: fix CDC for repository listener notification
- e81337b
- ftp: store calculated checksum using root privileges
- 3e97b6c
- [maven-release-plugin] prepare for next development iteration
Release 4.2.27
gplazma
Due to a formatting issue, certificates issued by InCommon CA were not accepted because of case mismatches between the supplied “postalCode” and “street” and the expected “PostalCode” and “STREET” field descriptors.
Since this release, dCache handles those certificates as expected despite the unusual naming convention.
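A minimal sketch of this kind of comparison follows, assuming OpenSSL-style slash-separated DNs. It is not the gPlazma plugin's code: the helper names and the example DNs are made up, and the simple split on "/" would mishandle attribute values that themselves contain a slash.

```python
def normalise_dn(dn):
    """Upper-case attribute names of an OpenSSL-style DN, keeping values as-is."""
    parts = []
    for rdn in dn.strip("/").split("/"):
        name, sep, value = rdn.partition("=")
        parts.append(name.upper() + sep + value)
    return "/" + "/".join(parts)


def same_dn(a, b):
    """Compare two DNs, ignoring letter case in attribute names only."""
    return normalise_dn(a) == normalise_dn(b)


supplied = "/C=US/postalCode=12345/street=1 Main St/CN=Jane Doe"
expected = "/C=US/PostalCode=12345/STREET=1 Main St/CN=Jane Doe"
print(same_dn(supplied, expected))
```

Only the attribute names are folded to one case; the values are still compared exactly, so two DNs that differ in a value remain distinct.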
pool
There was an issue with the timestamps output by the sweeper ls command. The command lists replica creation time and the time of the last access. The output for replica creation time was incorrect, as it showed the startup timestamp for the replica’s pool if that was more recent than the actual creation time of the replica. This issue was fixed, and the timestamps now show the correct values.
webdav
When users request a macaroon via an HTTP POST request targeting a specific path, a caveat is created that restricts the macaroon to that path (requests to / result in a non-limited macaroon).
Commit 99c726e3 resulted in users getting back a non-limited macaroon for every request. This issue was fixed with this release.
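The intended mapping from request path to caveat can be sketched as below. This is an illustration only: the helper name is hypothetical and the "path:" caveat notation is assumed for the example rather than taken from these notes.

```python
def path_caveats(request_path):
    """Hypothetical helper: caveats implied by the path a macaroon was requested on."""
    path = "/" + request_path.strip("/")
    if path == "/":
        # A request targeting the root yields no path restriction.
        return []
    return ["path:" + path]


print(path_caveats("/data/experiment"))
print(path_caveats("/"))
```

The regression described above corresponds to every request being treated like the root case, so that no path caveat was ever added.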
Changelog 4.2.26..4.2.27
- 0f9417c155
- [maven-release-plugin] prepare release 4.2.27
- e5de51d28d
- systemtest: fix OpenSSL DN format change
- d3be93e958
- webdav: fix path-to-caveat for macaroon minting endpoint
- eefd8ee5f3
- gplazma gridmap plugin: compare DNs ignoring letter case for attribute names
- c9cf8f7670
- webdav: fix NPE when Kafka notified file deletion
- 9069ab99a3
- [maven-release-plugin] prepare for next development iteration
- 121a95ccd8
- pool: report correct replica creation time to sweeper
Release 4.2.26
admin
The current release adds direct command execution capability to the admin interface. It supports a semicolon-separated (;) list of commands, like so:
ssh -p <port> user@example.com "command1; command2; command3"
alarms
Pool errors involving a fatal repository fault, for instance, can be sent now as an email alarm without having to send all pool disabled alarms.
billing
The storageInfo key is now better formatted when events are sent with the Kafka producer.
dcache-view
Problems were reported when using Firefox and/or Safari to browse dcache-view. This is now fixed.
gplazma
The JAAS gplazma plugin no longer logs a stacktrace on bad configuration.
pool
This release makes error messages clearer by avoiding the use of the same error message in multiple places.
This release boosts performance for Ceph pools.
This release fixes the lookup of the canonical hostname for IPv6 addresses; secure HTTP transfers now work over IPv6 and problems are easier to diagnose.
This release improves the error message previously reported as “Could not create mover” to provide more information about why the mover could not be created.
srm
Stack traces are no longer logged for concurrent updates in PinManager and similar (expected) failures.
transfermanager
When there is a problem starting the mover, transfermanager returns an error to the caller (e.g., WebDAV). Previously, this message did not include any details about the pool on which the failure occurred. This is now fixed: HTTP-TPC failures in which the pool does not start the mover now include the pool’s address in the error message, which allows admins to investigate further.
Third-party transfers now fail if the client requests to copy a file from dCache that has not been fully uploaded.
webdav
Disabling basic authentication no longer disables macaroons. This release fixes webdav.authn.basic and frontend.authn.basic so that setting these configuration properties to false no longer blocks macaroons from being accepted in the HTTP Authorization header.
This release improves the error messages for unauthenticated requests.
An IllegalArgumentException is now fixed: attempts by a client to copy a file that has not been fully uploaded result in a clear error response.
This release adds a switch to reject macaroons sent unencrypted. Following security recommendations, sites may now configure dCache to reject any macaroon sent over an unencrypted channel. The default behaviour is to continue accepting macaroons sent over an unencrypted channel, to avoid breaking existing deployments.
Changelog 4.2.25..4.2.26
- 5bdc686
- [maven-release-plugin] prepare release 4.2.26
- 5121043
- webdav/frontend: add switch to reject macaroons sent unencrypted
- 4ad054e
- webdav/frontend: disabling basic authn should not disable macaroons
- 82264f9
- srm: do not log a stack-trace on expected Exception errors
- 790ee1f
- transfermanager: fail third-party copy if the file is still being uploaded
- 7da9884
- webdav: fail COPY early if file is currently being uploaded
- 60f0ba2
- transfermanager: abort transfer if there is a bug
- 0f26643
- gplazma: JAAS plugin logs a stack-trace on misconfiguration
- 8713437
- transfermanager: include pool address in the mover start failure message
- 98bfa14
- pool: update error messages to make them distinct
- 9ed620c
- pool: avoid using the same error message in multiple places
- d80db22
- alarms: add pool dead alarm
- 24740b9
- pool: fix lookup for canonical hostname for IPv6 addresses
- 9e20b62
- admin : add direct command execution capability
- f3e9526
- pool: grow file prior FTP upload
- 11a9671
- [maven-release-plugin] prepare for next development iteration
- 5c8685e
- pool: don’t update atime on flush
- 81fc28d
- scripts: fix ‘dcache pool yaml’ command
- 1e277b4
- webdav: 401 for unauthenticated requests; message in status line
- a6d7daf
- dcache, frontend: release dcache-view version 1.5.3
- c37e589
- door: fix issue 4551 (wring storage)
- 7cfa48a
- dcache: update kafka-client lib version to 2.1.0
Release 4.2.25
dcache-frontend
The current release added documentation concerning restores.
ftp
This release fixes the MLSC command for non-small directories; Globus is now able to list directories with more than 100 directories.
xrootd4j
This release updates xrootd4j, bringing the following fixes and improvements: add ERROR status to tpc info; change the protocol version to int; prevent NPE when constructing error response; fix path handling in move request; correctly handle multiple authn protocols as indicated by the server; correctly handle IO/Security exceptions on credential loading; and correctly distinguish between kXR_wait and kXR_waitresp.
Changelog 4.2.24..4.2.25
- 6723aa4
- [maven-release-plugin] prepare release 4.2.25
- e4d03d7
- ftp: fix MLSC command for non-small directories
- ffe9566
- dcache-xrootd: remove mv request hack
- 7625685
- dcache-frontend: add documentation concerning restores
- ba5d0d2
- pom.xml: update xrootd4j dependency to 3.3.4
- eedc6e1
- dcache-frontend: undefined suid parameter on transfers should be NULL not “null”
- d6fd146
- [maven-release-plugin] prepare for next development iteration
Release 4.2.24
billing
Database connection loss is now reported for billing.
libs
postgres-jdbc was updated to 42.2.5, resulting in better integration with PostgreSQL 10 and better support for Java 11.
nfs
The NFS door assumed that a routable IP address, such as 130.199.49.35, cannot in general access a private subnet, such as 10.1.1.1. This assumption did not hold for all sites and resulted in non-functional pNFS deployments. This is now fixed.
webdav
This release fixes a problem where, if multiple concurrent PUT requests have directories in their path that do not already exist, all but one of the requests fail.
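The general shape of this race, and one common way to tolerate it, can be shown on a local filesystem. The sketch is a generic illustration, not the WebDAV door's code, and put_file is a hypothetical helper: each concurrent writer creates the missing parent directories and treats "already created by someone else" as success.

```python
import os
import tempfile
import threading


def put_file(root, rel_path, data):
    """Write data below root, creating missing parent directories first."""
    target = os.path.join(root, rel_path)
    # exist_ok=True treats "another request created it first" as success.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(data)


root = tempfile.mkdtemp()
threads = [
    threading.Thread(target=put_file, args=(root, f"a/b/c/file{i}", b"x"))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(os.listdir(os.path.join(root, "a", "b", "c"))))
```

If directory creation instead failed whenever the directory already existed, only the first writer to create a/b/c would succeed, which matches the symptom described above.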
zookeeper
This release updates the ZooKeeper library to version 3.4.13, a minor version update with many bug fixes. See: https://zookeeper.apache.org/doc/r3.4.13/releasenotes.html.
Changelog 4.2.23..4.2.24
- add726d
- [maven-release-plugin] prepare release 4.2.24
- 9f379b8
- nfs: do not filter device’s IP addresses based on site locality
- 02b0ed4
- dcache: wrap billing data source with AlarmEnabledDataSource
- b9dbf8b
- libs: use zookeeper–3.4.13
- 41231b3
- libs: use postgres-jdbc 42.2.5
- 25bb88d
- common: fix random data generation in TimeseriesHistogram unit test
- e8e5aa4
- webdav: work-around Milton racy API for creating collections
- f04def1
- webdav: fix name of root
- 517279f
- [maven-release-plugin] prepare for next development iteration
Release 4.2.23
dcache-view
The following new functionality has been added to dCache View: macaroons can now be used for file sharing. Files can be shared by sending the generated link, QR code or macaroon for the files to the person you want to give access to. Gravatar requests are now optional, and images are stored more efficiently to reduce the number of requests made.
gplazma
Since the update to newer CANL and voms-java-api libraries, sites have reported VOMS certificate validation errors like so:
[[canlError]:CAnL certificate validation error: Signature of a CRL corresponding to this certificates CA is invalid, [invalidAcCert]:LSC validation failed: AA certificate chain embedded in the VOMS AC failed certificate validation!, [aaCertNotFound]:AC signature verification failure: no valid VOMS server credential found.]
This is a consequence of not refreshing certificates from the trust anchor directory in the gPlazma voms plugin. The refresh is enabled by passing a refresh interval option when setting up the voms validator. The patch fab850d adds two variables, gplazma.vomsdir.refresh-interval and gplazma.vomsdir.refresh-interval.unit, that control the refresh interval:
gplazma.vomsdir.refresh-interval = 4
(one-of?MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS)\
gplazma.vomsdir.refresh-interval.unit = HOURS
Without this fix, the voms plugin fails to validate VOMS certificates, rendering dCache non-operational. Therefore sites running dCache releases 4.2.15–4.2.22 are strongly encouraged to upgrade.
nfs
This release fixes a bug caused by calling ByteBuffer#limit where Buffer#limit should have been used.
srm
The dcache ports command now includes the srm’s TLS/SSL interface.
Changelog 4.2.22..4.2.23
- 1eb4f14
- [maven-release-plugin] prepare release 4.2.23
- fc0cb33
- pom: use nfs4j-0.17.10
- fab850d
- gplazma voms plugin: add trust anchor refresh paramater
- e446b45
- srm: include TLS/SSL port in ‘dcache ports’ command
- e5f6959
- dcache, frontend: release dcache-view version 1.5.1
- 228bfb7
- [maven-release-plugin] prepare for next development iteration
Release 4.2.22
Changes affecting multiple services
The current release corrected the properties for access-log.
nfs
This release ships nfs4j version 0.17.9, which improves the performance of file system locks.
Changelog 4.2.21..4.2.22
- 55085b5
- [maven-release-plugin] prepare release 4.2.22
- ace5d36
- libs: use nfs4j-0.17.9
- 933a628
- correct the properties for access-log
- 3666c72
- [maven-release-plugin] prepare for next development iteration
Release 4.2.21
Changes affecting multiple services
If a client specified a checksum value with either a WebDAV or FTP upload, a “Restriction check by-passed due to missing path” warning was occasionally logged. This is now fixed, ensuring that restrictions are always applied.
gplazma
In situations where a user’s numerical UID is already known but other attributes are still needed, gPlazma can now query an LDAP server using the uidNumber attribute. The property gplazma.ldap.try-uid-mapping=true|false (defaulting to false) in the gplazma.properties file controls this new feature.
nfs
The NFS property nfs.idmap.manage-gids was added to help overcome the traditional 16-group limit for users: if a client uses AUTH_SYS and a user has more than 15 subgroups, the NFS door will query gPlazma to discover additional user groups.
pool
Space reservations on pools that are connected to tape showed a problem with failing restore requests: If a restore failed, the space that was reserved to hold the file that was supposed to come in from tape was not freed again but kept in the ‘sticky’ state. This resulted in lots of unusable space on pools that could only be reclaimed through a restart.
With the current release, this issue is fixed and space is freed as soon as possible after a failed restore request.
resilience
A very rare race-condition is fixed where a failed upload results in resilience recording a stack-trace.
webdav
An issue with the Milton WebDAV library prevented partial (or vector-read) GET requests from succeeding. This is now fixed through both an update of the dependency and a local patch, while we wait for the proposed fix to be included upstream.
Changelog 4.2.20..4.2.21
- 4b0eab45b0
- [maven-release-plugin] prepare release 4.2.21
- cd3d1bc8ba
- webdav: fix proxied partial (vector-read) GET requests
- 30c15824b0
- nfs: overcome the 16 group limit of AUTH_SYS
- e32bca8d3c
- pool: fix pool space accounting on failed restores
- e8d6ee19b3
- resilience: fix NPE if file unlinked when resilience processes a broken file
- 5d3864a423
- ldap: search user by uidNumber attribute if only UidPrincipal is provided
- 870fffd119
- ftp/webdav: fix bypass of restrictions
- b27b6a26fb
- [maven-release-plugin] prepare for next development iteration
Release 4.2.20
alarms
An internal issue with the alarms configuration was fixed, which should prevent a rare NullPointerException from occurring.
dcap
When a file or directory was created using the DCAP protocol with a URL as parameter, the file permissions were not set correctly.
With the current release, this was corrected, and such files use the client-supplied file permissions. If none are provided, the default modes 0700 (for directories) and 0600 (for files) are used.
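The fallback described above can be sketched as follows. This is a minimal Python illustration rather than dCache code: the function name `effective_mode` and its parameters are hypothetical, and only the default modes 0700 and 0600 are taken from the note.

```python
import stat

# Defaults quoted in the release note.
DEFAULT_DIR_MODE = 0o700
DEFAULT_FILE_MODE = 0o600


def effective_mode(is_directory, client_mode=None):
    """Return the permission bits for a newly created entry.

    Hypothetical helper: a client-supplied mode wins, otherwise the
    documented default for the entry type is used.
    """
    if client_mode is not None:
        # Keep only the permission bits of whatever the client sent.
        return stat.S_IMODE(client_mode)
    return DEFAULT_DIR_MODE if is_directory else DEFAULT_FILE_MODE


print(oct(effective_mode(is_directory=True)))
print(oct(effective_mode(is_directory=False, client_mode=0o644)))
```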
nfs
When a pool is decommissioned, clients can now be notified about the need to terminate any remaining connections. The pool reset id command was updated to issue a device change notification callback.
xrootd
Interoperability with the 4.9.x client series was improved.
An uncaught exception in xrootd doors was fixed.
Changelog 4.2.19..4.2.20
- 85b441434f
- [maven-release-plugin] prepare release 4.2.20
- 3ac7332249
- dcache-xrootd: add missing query support for tpc on pools
- d4ed957a18
- alarms: fix persistence.xml configuration
- f4154f7f0c
- dcap: fix permission propagation with DCAP
- f8fa92a4ae
- dcache-xrootd: handle possible race condition in directory listing
- 7f5a40d021
- nfs41: update reset pool command to issue cb device notify
- 3d6fc445f0
- [maven-release-plugin] prepare for next development iteration
Release 4.2.19
Changes affecting multiple services
The JaCoCo version was updated so that code coverage now works with the Java 11 runtime.
libs
This release updates nfs4j to a version that contains bug fixes and enhancements, such as a fix for LAYOUTGET usage with the current stateid.
nfs
When the door performs an asynchronous pool selection, the client request should block only the very first time it is made, in order to cope with busy systems and slow networks. This is now fixed: a client is no longer blocked if the door knows that the request has not been processed yet.
statistics
The metadata merge was using max when it should have used min; this is now fixed.
Changelog 4.2.18..4.2.19
- b816e01
- [maven-release-plugin] prepare release 4.2.19
- d8e9ac4
- libs: use nfs4j-0.17.8
- baa4c0d
- pom: use jacoco with java11 support
- 0ab1148
- common: fix histogram metadata merge
- b0f6ae5
- dcache-frontend,dcache-history: revisit NPE fix
- 85f8957
- nfs4: only block pool selection on the first attempt
- 61afa20
- [maven-release-plugin] prepare for next development iteration
Release 4.2.18
dcache-view
This release includes several bug fixes and improvements to dCache-View.
gplazma
The voms plugin now includes the user’s DN in the logged error message if it cannot validate the VO-membership information.
nfs
The pool selection task can now be reset if the pool is disabled after the start-mover message is received but before the redirect.
A stack trace logged when removing a missing file is fixed.
pool
The current release fixed failing HTTPS redirected transfers.
Download and upload via HTTPS in dCache-View require the pool to support CORS, which was missing. This is now fixed, and dCache-View uploads and downloads succeed.
Changelog 4.2.17..4.2.18
- 3a463d7
- [maven-release-plugin] prepare release 4.2.18
- fd3a263
- restful: Return not Found/404 for non existing pool.
- 0323a84
- nfs: reset pool selection task if pool disabled before redirect
- 4015559
- dcache: release dcache-view version 1.5.1
- f573141
- gplazma2: Log credential information on x509 cert. chain validation and FQAN extraction failures.
- cf01d6a
- nfs: ignore JdbcFs errors when constructing acceess log entries
- ce71873
- libs: update nfs4j to version 0.17.7
- 0d11f86
- nfs: handle chimera exception on remove of a missing file
- 9859ca0
- pool: add CORS support for HTTP requests
- 1fe6c73
- pool: use hostname in HTTPS redirection URL
- 37da2fd
- [maven-release-plugin] prepare for next development iteration
- 6d5d9f1
- dcache: release dcache-view version 1.5.0
Release 4.2.17
ftp
Leaking server sockets were observed when a client aborted a proxied transfer with Kafka enabled. This is now fixed: server sockets are no longer leaked when a proxy is being used, Kafka notification is enabled, and the client aborts the transfer.
chimera
Chimera shell is now able to find origin tags of a given name.
pool
This release fixes a stopwatch error; collecting IO statistics is now more robust, avoiding stack traces with the message “This stopwatch is already stopped”.
restful
Operation IDs are now unique, so Swagger-generated clients no longer suffer from function-name clashes.
srm
Clients that use the gridsite protocol, such as davix, can now delegate their credential.
Changelog 4.2.16..4.2.17
- 23c0d9f
- [maven-release-plugin] prepare release 4.2.17
- daf27f6
- dcap: fix NullPointerException:
- 1178f8f
- restful: Rename operations to have unique operationId.
- 7494641
- ftp: avoid kafka bug, make shutdown more robust
- 3ad364e
- dcache-history,dcache-frontend: check for serialized error when handling pool data request messages
- 272c627
- pool: fix stopwatch error
- 6ac21a4
- common: fix bug in CountingHistogram index computation
- 2ef9022
- libs: update to nfs4j-0.17.6
- eb161be
- [maven-release-plugin] prepare for next development iteration
- 45c992d
- srm: gridsite fix querying validity of delegated credential
- 549ce6f
- chimera: support origin tag discovery
Release 4.2.16
frontend
A new property, frontend.authz.unlimited-operation-visibility, now controls the visibility of operations exposing file metadata in the frontend. The default is false, meaning non-admin users can only see file operations for files which they own or which are anonymous. Setting it to true allows everyone access.
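The visibility rule can be illustrated with a short Python sketch. It is not taken from the frontend code: the names `FileOperation` and `may_view` are hypothetical, and modelling an "anonymous" file as one with no owner UID is an assumption made purely for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileOperation:
    # Hypothetical record; None stands for an "anonymous" file (assumption).
    owner_uid: Optional[int]


def may_view(op, user_uid, is_admin, unlimited_visibility=False):
    """Decide whether a user may see a file operation.

    unlimited_visibility mirrors the value of
    frontend.authz.unlimited-operation-visibility (default false).
    """
    if is_admin or unlimited_visibility:
        return True
    # Non-admins see only their own files and anonymous ones.
    return op.owner_uid is None or op.owner_uid == user_uid


print(may_view(FileOperation(owner_uid=1000), user_uid=2000, is_admin=False))
```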
ftp
The behaviour of FTP transfers was made more robust in cases where a client disconnects from the control channel prematurely.
The performance markers that dCache sends back to the client in FTP transfers are now more robust against bugs.
nfs
When transient errors in pools cause NFS transfers to have to wait and retry, the system’s behaviour is now more robust and no StackOverflowErrors should be logged any more.
scripts
Maven’s findbugs plugin is now granted more working memory in order to make builds, especially on our continuous integration system, more robust.
srm
Certificate lifetime considerations for VOMS proxy certificates are improved in this release: if a client delegates a credential where the VOMS AC expires before the X.509 proxies, dCache now will not use the credential beyond the AC expiry time. This avoids unnecessary authentication errors.
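As an illustration of the lifetime rule, the following Python sketch takes the earlier of the two expiry times. The function name `usable_until` and its arguments are hypothetical; how dCache actually extracts these timestamps from a delegated credential is not shown here.

```python
from datetime import datetime, timezone


def usable_until(proxy_not_after, ac_not_after=None):
    """Return the time after which a delegated credential should not be used.

    Hypothetical helper: if a VOMS AC is present, the credential is only
    usable until the earlier of the proxy expiry and the AC expiry.
    """
    if ac_not_after is None:
        return proxy_not_after
    return min(proxy_not_after, ac_not_after)


proxy_expiry = datetime(2018, 11, 1, 12, 0, tzinfo=timezone.utc)
ac_expiry = datetime(2018, 11, 1, 6, 0, tzinfo=timezone.utc)
print(usable_until(proxy_expiry, ac_expiry))
```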
webdav
When the WebDAV door is considering an HTTP third-party-copy request that uses grid-site delegation, there is a minimum 20 minute validity that any existing delegated credential must satisfy. If this is not satisfied then dCache will request a fresh delegated credential.
Until now, if the client failed to delegate a fresh certificate then the subsequent COPY request was rejected. This release changes that behaviour and enables such transfers.
Changelog 4.2.15..4.2.16
- c968d5e646
- [maven-release-plugin] prepare release 4.2.16
- 597047c4a5
- nfs: increase request retry delay when selecting/starting pool or mover
- e1f93b74bc
- dcache-frontend: provide switch to control visibility of file operations for non-admin users
- fabb8044d8
- webdav: adjust minimum validity after requesting delegation
- 935e4825bb
- srmmanager/webdav: consider VOMS AC validity of delegated credential
- b1e41c1bc3
- ftp: make performance marker task robust.
- 903847efb5
- ftp: avoid NullPointerException if adapter is not connected
- 296fecec02
- [maven-release-plugin] prepare for next development iteration
- 8d1dcf5cc8
- scripts: Avoid findbugs memory errors
Release 4.2.15
libs
This release includes updated library versions Bouncy Castle 1.54, CANL 2.5.0 and voms-api-java 3.3.0 which, in addition to the usual security updates, should help resolve the “pad block corrupted” issue that a few deployments showed with xrootd transfers.
pool
Diagnostic logging for failed HTTP third-party transfers was improved.
Billing records for failed transfers now show more detailed information.
The handling of cancelled flush requests for nearline media was rewritten to be more efficient. This resolves issues where pools report “Flush of 0000… failed with: CacheException” followed by “Pool restart required: Internal repository error”.
Compatibility with DPM was improved by increasing HTTP GET requests’ timeouts. This should allow more transfers to succeed.
poolmanager
Supplying poolmanager with an unresolvable hostname as the target will now result in an UnknownHostException instead of the previous behaviour where an (unnecessary) NullPointerException was thrown.
scripts
The format and content of the Storage Description JSON file have been updated according to WLCG suggestions:
- ‘capacity_id’ field is renamed to ‘name’
- ‘total_space’ and ‘used_space’ renamed to ‘totalsize’ and ‘usedsize’ respectively.
- ‘timestamp’ field added
- ‘vos’ field added
- ‘assignedendpoints’ added. Currently hardcoded to “all”.
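For consumers that still hold records in the old layout, the renames listed above could be applied along these lines. This Python sketch is only an illustration: the function `migrate_share`, the flat record shape, the epoch-seconds timestamp and the list form of `assignedendpoints` are all assumptions, not details confirmed by the note.

```python
import time

# Field renames taken from the list above.
RENAMED = {
    "capacity_id": "name",
    "total_space": "totalsize",
    "used_space": "usedsize",
}


def migrate_share(old, vos, now=None):
    """Convert one old-style record (hypothetical shape) to the new field names."""
    new = {RENAMED.get(key, key): value for key, value in old.items()}
    # New fields; the value types are assumptions made for this sketch.
    new["timestamp"] = int(time.time()) if now is None else now
    new["vos"] = list(vos)
    new["assignedendpoints"] = ["all"]  # the note says this is hardcoded to "all"
    return new


old_record = {"capacity_id": "atlas-disk", "total_space": 1000, "used_space": 250}
print(migrate_share(old_record, vos=["atlas"], now=1538400000))
```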
srm
Logging of errors in the SRM credential store was improved.
webdav
If a non-resolvable host name is given as the source or destination of a third-party copy request, WebDAV will now fail the transfer immediately instead of waiting for a Poolmanager timeout.
Diagnostic logging for failed HTTP third-party transfers was improved.
xrootd
dCache now allows xrootd clients to specify a query/opaque string in a kXR_mv request’s source path.
Changelog 4.2.14..4.2.15
- 0d84527d7c
- [maven-release-plugin] prepare release 4.2.15
- 09543a34bb
- pool: HTTP TPC rework exception logging
- 4220befbbc
- pool: increase TPC socket timeout for GET requests
- 0ef9014b0b
- srm: fix credential store logging
- ae6cc5f15d
- pool: update log status using exception class name if no message
- c4262b3b1c
- storagedescriptor: update information based on WLCG feedback
- 089d7335ef
- xrootd: strip off query part from kXR_mv source
- 4a5473b86e
- webdav: fail TPC request early on unknown hostname
- 9ecfc52ce9
- nearline-provider: do not propagate thread interrupt flag
- cbc1223b39
- poolmanager: fix NPE on unknown host
- 61a3174bd7
- webdav: improve logging of TPC requests
- eafbd5c1b1
- libs: upgrade to bouncycastle 1.54, CANL to 2.5.0 and voms-api-java to 3.3.0
- 3a021a49e8
- [maven-release-plugin] prepare for next development iteration
Release 4.2.14
Changes affecting multiple services
In order to more easily identify a rejected macaroon in the logs, its ID is now included in the log message.
After discussing an issue observed in a production system at DESY, we decided to revert the BouncyCastle 1.50 packaging change introduced in 4.2.13 until a more thorough analysis has been done. The original intent of that change was to resolve rare “pad block corrupted” issues with xrootd transfers.
xrootd now shows IP addresses instead of (potentially ambiguous) hostnames as transfer endpoints.
An irrelevant stacktrace was logged on unexpected CacheExceptions. This was removed, leading to less clutter in the logs.
Different macaroons that were issued against the same secret are now discernible in the logs.
Users now get more information about the reasons why an invalid macaroon was rejected: HTTP requests that are made with an invalid macaroon have a 401 HTTP response with the status-line explanation phrase that describes why the macaroon is invalid.
The access log file also logs why a macaroon was rejected.
core
A library dependency was updated to avoid CVE-2018-11771. This patch introduces no user-visible changes.
frontend
GitHub issue #4242 was resolved; cell information can now be gathered using the REST interface without specifying domains.
gplazma
Invalid macaroon logins no longer “spam” gPlazma.
Group Identifier numbers (GIDs) can now freely be chosen from within the numerical range defined by Java’s long data type, up to 2^64-1.
httpd
Transfer information was improved: A call to hostname.tld:2288/context/transfers.json now also includes path info for transfers.
pnfsmanager
When creating a macaroon to allow uploading of data, the desired path may not already exist. Without restrictions, WebDAV will auto-create parent directory items that are missing, or the client can create these directory elements explicitly with MKCOL.
With restrictions (such as from a macaroon) such directory creation currently requires the MANAGE activity, which allows other actions beyond the scope of this scenario. With this release, the behaviour was changed so that a user with a macaroon that authorises them to upload data into a particular directory will be able to create parent directories to achieve uploading the data.
pool
A regression caused pools that had their size only specified in a layout file to report a size of 8 Exabytes. This issue was fixed.
dCache now supports a DPM-specific HTTP extension that indicates the checksum calculation is not yet complete, avoiding potential data corruption with third-party copies: If DPM is calculating a checksum, then any RFC 3230 (i.e., with a ‘Want-Digest’ header) GET or HEAD request returns a ‘202 Accepted’ response status line and an HTML page as the response entity. Since dCache considers any 2xx response as success, the HTML page was previously accepted as the file’s contents, resulting in data corruption.
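The idea of treating ‘202 Accepted’ as "not ready yet" rather than as success can be sketched in Python as below. This is not the pool's implementation: the function `fetch_with_202_retry`, the injected `fetch` callable returning `(status, body)`, and the attempt count and delay are all hypothetical choices for the example.

```python
import time


def fetch_with_202_retry(fetch, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it stops answering 202 Accepted.

    fetch is a hypothetical callable returning (status_code, body).
    A 202 means the remote checksum is still being calculated, so the
    body must not be treated as file content.
    """
    for attempt in range(attempts):
        status, body = fetch()
        if status == 202:
            if attempt + 1 < attempts:
                sleep(delay_seconds)
            continue
        if 200 <= status < 300:
            return body
        raise RuntimeError(f"request failed with HTTP status {status}")
    raise TimeoutError(f"still 202 Accepted after {attempts} attempts")


# Simulated server: busy twice, then ready.
responses = iter([(202, "<html>busy</html>"), (202, "<html>busy</html>"), (200, "file data")])
print(fetch_with_202_retry(lambda: next(responses), sleep=lambda seconds: None))
```

Injecting `fetch` and `sleep` keeps the sketch independent of any particular HTTP client and makes it easy to exercise without a network.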
dCache pools no longer log a stack-trace for non-bug P2P failures.
srm
The domain ‘.access’ log file now contains log information for grid-site delegation activity, which facilitates debugging of http third-party-copying issues.
transfermanagers
The “restriction check by-passed” warning for each WebDAV-initiated third-party transfer is fixed.
webdav
A user may request a macaroon by making an HTTP POST request to the WebDAV door. The log entry for such a request now also records the ID and type of the macaroon.
A previous patch needed a bit of an update to ensure that X.509-with-FQAN authenticated third-party transfers with macaroons work under all circumstances. This is now ensured.
xrootd
A previous patch, committed on master as 6e90136, introduced xrootd properties to control the behaviour of the xrootd door when sending kafka events. This patch contained a typo in the default value. The result was dCache would not start with a kafka enabled xrootd door. This issue has been fixed.
Changelog 4.2.13..4.2.14
- d4eeef2c6f
- [maven-release-plugin] prepare release 4.2.14
- e3881d4517
- xrootd: fix broken configuration property
- 93adbb27cd
- Revert “packaging: use private BC 1.50 release that provides JSSE compatible handling of key agreement secret generation.”
- 1e99b2e16c
- gplazma: support large gid values for roles
- 403ffa3095
- pool: P2P failures trigger stack-trace
- b7dd724b2b
- webdav: obtain FQAN from X.509 credential for gridsite
- 0c5fa778e5
- core: avoid sending bad macaroons to gplazma
- 3e03f5085d
- webdav: update access log to record macaroon request details
- 21e168325b
- transfermanager: fix missing path
- f6fe0b2163
- ftp: java.lang.IllegalStateException: Cannot send after the producer is closed.
- 18ebdc50a3
- dcache: use getAddress for uniform client IPs in Transfer info
- 1db446d680
- libs: update to commons-compress-1.18
- c223e0afe9
- httpd: add path to context/transfers.json
- 2ca2edc9d5
- macaroons: include macaroon id in error message
- af0c14fc17
- pool: fix pool’s runtime configured size regression (b70b0d9)
- ee9685760a
- core: provide better feedback and logging if a macaroon is rejected
- e7416d7035
- dcache-frontend: fix array out of bounds exception in cell info service
- 910ed77ac4
- pool: update HTTP TPC to support retrying GET and HEAD requests for DPM
- bc15b8cfe2
- srm: add gridsite delegation interface access-log
- 60f5628b02
- macaroons: fix logged id
- e773fd08c2
- dcache-frontend: add path filter to transfers
- 8e35daa3c6
- core: avoid stacktrace on arbitrary CacheException
- 4b4cd464b2
- [maven-release-plugin] prepare for next development iteration
- a514557a15
- pnfsmanager: allow restricted user with UPLOAD to create parent directories
Release 4.2.13
Changes affecting multiple services
Through integration of an updated BouncyCastle cryptography library package, some rare errors where clients would report a “pad block corrupted” error are avoided.
frontend
When a user does not have the permission to read a file (or is simply not logged in), dCache would previously report a 500 Internal Server Error. This error reporting was improved, reporting 401 Unauthorized or 403 Forbidden as appropriate.
dCache has a hard-coded behaviour where a user providing bad authentication (e.g., wrong password, expired OIDC access-token or macaroon) is treated as the anonymous user.
This has proved counter-intuitive, as wrong/expired credentials often appear to succeed for some operations (e.g., directory listing), while failing others (upload/download).
A new configuration property, dcache.enable.authn.anonymous-fallback-on-failed-login, allows changing that behaviour. The default setting of the property does not change system behaviour.
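The effect of the switch can be pictured with a small Python sketch. It is a simplified model, not the door's login code: `resolve_user`, `LoginFailed` and the `authenticate` callable are hypothetical, and treating a request without any credential as anonymous is an assumption of the example.

```python
ANONYMOUS = "anonymous"


class LoginFailed(Exception):
    """Raised when a bad credential is rejected instead of downgraded."""


def resolve_user(authenticate, credential, anonymous_fallback=True):
    """Map a credential to a user name.

    authenticate is a hypothetical callable returning a user name, or
    None for a bad credential. anonymous_fallback=True models the
    existing behaviour, which the property's default preserves.
    """
    if credential is None:
        return ANONYMOUS  # assumption: no credential means anonymous access
    user = authenticate(credential)
    if user is not None:
        return user
    if anonymous_fallback:
        return ANONYMOUS
    raise LoginFailed("bad credential rejected")


known = {"valid-token": "alice"}
print(resolve_user(known.get, "expired-token"))
```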
ftp
If the ftp client requests a proxied passive transfer with a different IP family from the control channel (i.e., the client connects using IPv6 and requests an IPv4 data channel, or vice versa) the ftp server must select which IP address it should return to the client.
As pointed out by Francesco Prelz (thanks!), the door currently selects the first address from the same interface that has the desired IP family. However, this may not be accessible by the client.
This release updates address selection so that only usable addresses will be returned to the clients.
info
Clients querying the info service (such as info-provider and storage-report) are now informed of the number of files stored in a space reservation.
The info service now displays the time at which the information it displays was recorded.
nfs
Using the latest NFS4J library brings some performance improvements.
pool
Monitoring transfers is now a bit more comfortable as this release adds path information to a transfer’s information.
poolmanager
This release increases responsiveness for users that are not allowed to stage files, and for NFS users who access offline files. In cases where such a user issued a read request at the same time that Pool Manager handled a staging request, the first request would block for the duration of the staging – potentially quite a while. From now on, users that are not allowed to stage receive appropriate error messages as soon as possible, without having to wait for anyone else.
xrootd
The --zip option of xrootd clients is now supported.
Support for xrootd mkdir was improved.
Changelog 4.2.12..4.2.13
- fcc1adcb70
- [maven-release-plugin] prepare release 4.2.13
- 5bf1013356
- xrootd: remove spurious stack-trace
- 970434b225
- xrootd: add support for kXR_stat on open files
- 8892290b5a
- xrootd: update to xrootd4j dependency to 3.3.3
- 5c7597f335
- packaging: use private BC 1.50 release that provides JSSE compatible handling of key agreement secret generation.
- 1958003b5c
- dcache-frontend: fix error message for IdResource
- 05346f76d3
- ftp: better address selection for cross-family passive proxied transfers
- 9ddd28ac95
- webdav/frontend: make anonymous fallback on bad login optional
- 173de7a6bf
- info/space-manager: monitor number of files in reservation
- 2bba771c56
- info: display the timestamps when metrics were collected
- c09d13284a
- poolmanager: do not squash request if state is not allowed
- d7e1539492
- libs: use nfs4j-0.17.5
- 4d026acdb3
- [maven-release-plugin] prepare for next development iteration
- 547776f84b
- dcache: add path to transfer information
Release 4.2.12
Changes affecting multiple services
For dcap and ftp doors, a new Kafka producer was created for each transfer. As a result, the server was shut down with the IO exception java.io.IOException: Too many open files. This is now fixed.
The previous format of the date key in messages, "date":"Mon Oct 01 13:49:00 CEST 2018", was not easy to parse. The new format, "date":"2018-10-01T13:50:30.008+02:00", is easier to parse.
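As a brief illustration of why the new format is easier to handle, a consumer written in Python can parse it with the standard library alone. The helper name `parse_event_date` is hypothetical; the sample value is the one quoted above.

```python
from datetime import datetime, timedelta


def parse_event_date(value):
    """Parse the ISO 8601 style value of the "date" key (hypothetical helper)."""
    return datetime.fromisoformat(value)


stamp = parse_event_date("2018-10-01T13:50:30.008+02:00")
# The UTC offset is carried explicitly, so no time zone name lookup is needed.
print(stamp.isoformat(), stamp.utcoffset() == timedelta(hours=2))
```

The old value, by contrast, relied on a zone abbreviation (CEST), which generic date parsers do not resolve reliably.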
frontend
Previously, all monitoring information was available only to users with the admin role. This was a change from the earlier webadmin interface, where most information (except alarms and space token definition) had been available to all users. That was probably too restrictive and may have disrupted some usage patterns. This release relaxes the access semantics to be more compatible with previous usage.
gplazma
The gplazma2-fermi vogroup plugin now supports wildcard matching on user FQANs.
systemd
systemd did not inherit the system-wide limits and completely ignored /etc/security/limits.d/92-dcache.conf. This is now fixed: the limits are loaded and applied as expected.
vehicles
This release fixes a serialization regression in FileCorruptedCacheException.
Changelog 4.2.11..4.2.12
- 1400080
- [maven-release-plugin] prepare release 4.2.12
- a47b547
- dcache-xrootd: add missing kafka property
- d9bb047
- vehicles: fix serialization regression in FileCorruptedCacheException
- d7c9480
- Update StorageInfoMessageSerializer.java
- 8c3ea5e
- dcache: Adjust Date formating of Timestamp value for date key for kafka producer
- 75e5f9b
- dcache: Creating multiple KafkaProducer instances results in ‘Too many open files’
- 62624b0
- gplazma2-fermi plugin (vogroup plugin): allow for wildcard match of fqans
- b4bec04
- dcache-frontend: remove admin restrictions on GET and filter transfers on uid if not admin
- 83847a0
- [maven-release-plugin] prepare for next development iteration
- 7218c51
- systemd: Add /etc/security/limits.d/92-dcache.conf in the dcache systemd unit and generator.
Release 4.2.11
Changes affecting multiple services
This release fixes an issue with WebDAV 3rd-party-copy requests that are authorized using a macaroon that is only valid for writing a specific file.
NOTE: both the webdav door and transfermanagers must be updated before the fix is effective.
The timeout used by dCache when attempting to send a Kafka event is now adjustable via the configuration properties dcache.kafka.maximum-block and dcache.kafka.maximum-block.unit.
The default timeout for pools, and the xrootd, nfs and webdav doors is now non-zero. This should fix the problem of kafka events being lost under normal operational conditions.
NFS4J was updated to a newer version with two upstream bugfixes:
- compatibility with standard rpcbind clients
- allow regular users to change file’s owner to the same value
pool
In order to help with debugging issues with partial FTP transfers, dCache pools now are able to log considerable information about failed FTP transfers.
This is controlled by the new property pool.mover.ftp.enable.log-aborted-transfers.
transfermanagers
An old table that was previously used for debugging and monitoring purposes was removed, leading to better performance of the transfermanagers service.
webdav
dCache can now transfer data with a remote site, authenticating with that remote site using a delegated X.509 credential, but authenticating locally with a macaroon.
xrootd
This release updates xrootd4j, which should help fix occasional “pad block corrupted” issues with older clients.
Changelog 4.2.10..4.2.11
- 43f7091ce7
- [maven-release-plugin] prepare release 4.2.11
- 818e38da53
- pom.xml: update to xrootd4j dependency to 3.3.1
- 2e991d0c5a
- webdav: use TLS credential directly for gridsite
- 4dea39e85d
- pool: instrument ftp mover to show partial transfers
- 6faca97336
- ftp: fix regression in unit-tests
- 64db762f89
- libs: update to nfs4j-0.17.4
- 3f4e24e3b1
- dcache: add configuration for the Kafka producer timeout
- 69bb883d19
- TransferManager: remove state history class and corresponding table responsible for storing request state changes. It is not used, but may grow rapidly in database.
- fe4d837004
- webdav+transfermanagers: support TPC pull with targeted macaroons
- 211713183e
- [maven-release-plugin] prepare for next development iteration
Release 4.2.10
gplazma-role
This release defines a new observer role. It is weaker than the “admin” role and is useful for granting read-only access to system or file information.
Changelog 4.2.9..4.2.10
- f82cfff
- [maven-release-plugin] prepare release 4.2.10
- 047fc4c
- gplazma-role: add observer role
- ba8579e
- [maven-release-plugin] prepare for next development iteration
Release 4.2.9
frontend
This release fixes broken directory QoS reporting. The frontend now describes the QoS of directories more accurately; i.e., the QoS that newly written files will receive when written into the directory, assuming none of the targeted pools are volatile.
webdav
Macaroon creation with multiple path restrictions failed with an HTTP 500 error. This is now fixed, and macaroon creation succeeds when multiple path restrictions are defined.
This release improves error handling for PROPFIND requests.
Changelog 4.2.8..4.2.9
- 114c2ca
- [maven-release-plugin] prepare release 4.2.9
- d47ab2b
- frontend: fix broken directory qos reporting
- 9a6aff8
- webdav: avoid throwing any exception when listing a directory for PROPFIND
- 013383a
- webdav/macaroon: Fix macaroon creation with multiple path restrictions.
- 5506a64
- [maven-release-plugin] prepare for next development iteration
Release 4.2.8
dcache-view
Several fixes have been implemented for dcache-view: OpenID Connect redirect handling and file download are fixed.
dcache-xrootd
This release adds the necessary GSI properties for TPC credentials. These properties concern the third-party client on the pool. Note: this is a temporary workaround for a problem which will be solved in a more general fashion. For more details, please check xrootd-gsi.properties.
ftp
dCache now has the ability to log the current status of a transfer at the point the client decided to abort an FTP transfer. This should support a post mortem investigation on why a transfer was cancelled.
ftp,dcap
This release adds the ability to push billing events to a Kafka server when ftp.enable.kafka and dcap.enable.kafka are enabled.
nfs
With this release, the timeout of the pnfshandler is configurable, and the nfs door recovers more quickly from situations in which a PnfsManager is not available.
poolmanager
Previously, it was not possible to stage files from tape. This is now fixed.
Changelog 4.2.7..4.2.8
- bc73677
- [maven-release-plugin] prepare release 4.2.8
- 2c2c3b8
- ftp: add ability to log client-aborted transfers
- 99ac792
- nfs: make timeout of pnfshandler configurable
- 23f8145
- poolmanager: fix staging files from tape
- df52050
- pool: fix NullPointerException
- 420d9c5
- ftp: fix NullPointerException
- 5875eb8
- dcap: clean code changes
- 9a99bbe
- ftp: add kafka to push messages
- 754ebd9
- dcap: enable possibility to push transfer events to Kafka
- 611a2b6
- dcache: release dcache-view version 1.4.5
- 329a998
- dcache-xrootd: add necessary gsi properties for tpc credentials
- bb27660
- [maven-release-plugin] prepare for next development iteration
Release 4.2.7
NFS
When two clients A and B operate on a file in quick succession, with A opening the file and B deleting it before LAYOUTGET is called, dCache put the transfer into the list of active transfers and returned NFS4ERR_NOENT. If a client optimized the corresponding CLOSE call away, as some do, the entry was never removed from the list, effectively creating a leak.
This problem was fixed. Clients now receive an NFS4ERR_STALE message in those cases.
core
Certain transfer failures, such as attempting to use a space-reservation that has insufficient capacity, resulted in the door eventually reporting a time-out problem to the client.
A typical error message would resemble
Request to [>SpaceManager@local ... ] timed out.
This problem was traced to an internal misconfiguration of a messaging component and is fixed from this release onwards.
frontend
The reporting of a file’s QoS status in frontend was improved. Files that are being scheduled for moving to tape are now reported as ‘tape’ instead of ‘disk’.
pool
In some cases, storage info data was not included in messages issued during a pool flush. This caused an irrelevant NPE to be logged.
This problem was solved, and as a side effect of the fix, billing records now have the correct format when reporting on flushes:
08.24 15:24:41 [pool:dcache-lab002-A@dcache-lab002Domain:store] [00006BD12E8925744156AAE87641D4AF73BB,1362] [/] 100013 2 {10006:"Flush was cancelled."}
vs.
8.24 15:51:07 [pool:dcache-lab002-A@dcache-lab002Domain:store] [00005C1387649DD74E0491DFE9A98D97DC39,1362] [/] data:sla2@osm 100015 1 {10006:"Flush was cancelled."}
A bug was fixed that occasionally caused problems with the pools’ Berkeley DB. This could, for example, be triggered by removing files which were in a flush queue.
A typical error message was, e.g.
27 Aug 2018 12:09:33 (cat2_lhcbtape) [Frontend-dcacheview PoolDataRequest] Fault occurred in repository: Internal repository error. Pool restart required: : CacheException(rc=204;msg=Meta data lookup failed and a pool restart is required: (JE 7.3.7) Environment must be closed, caused by: com.sleepycat.je.ThreadInterruptedException: Environment invalid because of previous exception: (JE 7.3.7) /space/lhcb/tape/pool/meta java.lang.InterruptedException THREAD_INTERRUPTED: InterruptedException may cause incorrect internal state, unable to continue. Environment is invalid and must be closed.)
27 Aug 2018 12:09:33 (cat2_lhcbtape) [Frontend-dcacheview PoolDataRequest] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server,dead): Pool restart required: Internal repository error
webdav
Web clients (such as web-browsers) make OPTIONS pre-flight requests to discover what they are allowed to do, according to the CORS standard.
Unfortunately, some web-browsers make the OPTIONS request without presenting any credentials. If the resource was within a protected directory, dCache previously failed the OPTIONS request.
This release introduces a new behaviour where such requests will always succeed, so that browser pre-flight requests are not hampered.
Changelog 4.2.6..4.2.7
- ea02df1ff1
- [maven-release-plugin] prepare release 4.2.7
- bc2fd01e8c
- nearline-provides: do not interrupt processing thread on cancel
- 1e69881794
- nfs41: invalidate open-state on layoutget if file is removed
- b1af2ffc4f
- pool: fix NPE on flush
- 9dd5c2ac90
- webdav: always respond to OPTIONS request
- 76dad0e1de
- core: ensure pool/poolmanager communication receives errors
- ef8da1a340
- frontend: add targetQoS for not-yet-flushed tape files
- b30b01fb9c
- [maven-release-plugin] prepare for next development iteration
- 121f2cf66d
- dcache: release dcache-view version 1.4.3
Release 4.2.6
dcache-xrootd
This release fixed xrootd third-party transfer billing records so that the door client is always the user connection and the billing/mover client the source or destination server. This allows one to see immediately whether an xrootd transfer was third-party (for two-party transfers, the two clients will be identical).
gplazma
The OidcAuthPlugin plugin was updated so that users whose OP does not provide the name claim, and provides neither the given_name nor the family_name claim, can use dCache.
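As a rough illustration of the rule described above (not the plugin's actual code), a display name could be derived from OpenID Connect claims as sketched here. The function name full_name_from_claims is hypothetical, and the fallback of joining whichever name parts are present is an assumption; only the claim names name, given_name and family_name come from the text.

```python
from typing import Mapping, Optional


def full_name_from_claims(claims: Mapping[str, str]) -> Optional[str]:
    """Return a display name from OIDC claims, or None if none is available.

    Hypothetical helper for illustration; not dCache's implementation.
    """
    # Prefer the single 'name' claim when the provider supplies it.
    name = claims.get("name")
    if name:
        return name
    # Otherwise join whichever of the two name parts are present (assumption).
    parts = [claims.get("given_name"), claims.get("family_name")]
    joined = " ".join(p for p in parts if p)
    # No usable claim at all: return None instead of failing the login.
    return joined or None


print(full_name_from_claims({"given_name": "Ada", "family_name": "Lovelace"}))
print(full_name_from_claims({"sub": "abc"}))
```

The point of the sketch is the last line of the function: a missing name is treated as absent information rather than as an error.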
pool
Bad input to the following queue admin commands no longer results in a stack-trace being logged:
- queue activate
- queue activate class
- queue remove class
- queue suspend class
- queue resume class
- queue remove pnfsid
poolmanager
A NullPointerException that occurred when staging files back from tape while poolmanager.enable.cache-hit-message is true has been fixed.
webdav
This release updates the default credential delegation for third-party copy so that requesting a third-party copy using a macaroon no longer triggers a failed attempt at OpenID-Connect delegation.
Changelog 4.2.5..4.2.6
- 2af14c4
- [maven-release-plugin] prepare release 4.2.6
- 611b43d
- poolmanager: fix NullPointerException when staging files and reporting hits
- f42afba
- gplazma: oidc fix FullNamePrincipal creation
- 53b1b48
- libs: update jetty to version 9.4.11
- 81f9ce5
- pool: ‘queue’ admin commands not the log stack-trace on bad arguments
- 52e13d7
- dcache-xrootd: fix third-party billing records
- 2dd2699
- ftp: fix scope of used pool stub
- f1d1d94
- webdav: update default credential delegation for third-party copy
- 4ffc7a5
- srmclient: update delegation client to support X509_CERT_DIR en.var.
- 7eca970
- [maven-release-plugin] prepare for next development iteration
Release 4.2.5
history
This release fixes a bug that could cause startup errors in the history service in the face of network errors.
many
Remote pool monitor would occasionally log stack traces from exceptions when a domain shut down due to an interrupt. This has been fixed, reducing the number of irrelevant log entries in such situations.
xrootd
A small bug in a third-party client which does not propagate errors occurring too quickly could cause error messages to not be displayed to the user. One such example would be an auth failure due to bad configuration, which would appear as a file size mismatch to the user. This issue has been worked around, and all errors should now be reported correctly.
Changelog 4.2.4..4.2.5
- 8bf8c1a5dd
- [maven-release-plugin] prepare release 4.2.5
- 280bb7122d
- dcache-xrootd: repair handling of delayed sync errors to client
- 02fbe1f2e8
- dcache-history: handle Gson syntax errors explicitly
- cdfab2a1f3
- cells: add handling of RemoteProxyFailureException nested InterruptedException to UncaughtException handler
- a8439d4354
- dcache: remove unused jndi initializers from httpd
- c633404d68
- [maven-release-plugin] prepare for next development iteration
Release 4.2.4
dcache-xrootd
The current release added support for third-party copy to and from dCache.
Changelog 4.2.3..4.2.4
- 11994c0
- [maven-release-plugin] prepare release 4.2.4
- b5a1d9e
- dcache-xrootd: add support for redirect handling during third-party-copy
- a1619fa
- dcache-xrootd: add third-party support to pool (dcache as destination)
- b7087ab
- dcache-xrootd: add support for third-party copy to door
- 381d969
- [maven-release-plugin] prepare for next development iteration
Release 4.2.3
nfs
dCache 4.2 now uses nfs4j version 0.17.3, which includes a bugfix that should help avoid some rarely observed deadlocks with current Linux clients.
Changelog 4.2.2..4.2.3
- 94a7a8a2a9
- [maven-release-plugin] prepare release 4.2.3
- f6508c3019
- pom: use nfs4j–0.17.3 bugfix version
- ac59cb3056
- [maven-release-plugin] prepare for next development iteration
Release 4.2.2
PNFS
pool
HTTP responses now contain more meaningful messages along with the HTTP response codes, instead of showing only stock messages like “400 Bad request”.
webdav
When the WebDAV door proxies a transfer and a transfer failure occurs, the door previously always just reported “500 Internal Error”. This reporting is now improved, with any more detailed error messages from (possibly) other services taking precedence. For example, if a pool returns a 400 error code, thus complaining about the client’s request, this code is reported instead, which should help with diagnosing the error’s cause.
Changelog 4.2.1..4.2.2
- 7fdc1fab07
- [maven-release-plugin] prepare release 4.2.2
- 2fb72a7074
- webdav: pass on status message phrase to client
- f1ae38b5e1
- pool: update HTTP mover to report errors as HTTP status message phrase
- dfd2d85a11
- admin: fix regression in startup
- 7dbd5da6a2
- pnfsmanager: fix digest name handling in get file checksum command
- 1be5d6a26d
- [maven-release-plugin] prepare for next development iteration
Release 4.2.1
frontend
The frontend now correctly handles situations where a transfer that is already completed is killed. Be aware that in order to make use of the bugfix, both the pools and the head nodes need to be updated to at least 4.1.9.
ftp
Since commit eefb964, 3rd-party-transfers using ftp between two dCache endpoints had an issue where connections were not reliably closed. This release fixes the problem.
pool
This release improves dCache’s robustness against network errors: In case registering a file with PNFS manager fails due to a timeout, the request is retried transparently.
resilience
Resilience suffered from a bug that would lead to a NoSuchElementException when a pool name no longer mapped to a location known to the Resilience service. This issue has been fixed.
When multiple pools go offline it is possible that all replicas for a given resilient file become unreadable. If the file is not CUSTODIAL, and thus cannot be restored from tape, the discovery of such a file during scanning will generate an error in the ‘history errors’ listing, in the resilience domain .resilience log, and will also raise a general alarm concerning the pool.
There currently exists a command, ‘inaccessible’, which generates a listing of the pnfsids on a given pool which in the current state of dCache have no readable replicas. However, this command takes a while to complete (asynchronously), and the output is written to a file which must be viewed by logging in.
This release introduces ‘referring pool’ information to the error output so that grepping the resilience log for a given pool becomes easier, and adds options to the command to check further details.
scripts
A regression in the dcache pool convert command was fixed; the command works again.
The instructions that are printed out once dcache pool convert completes successfully now correctly point to the property that needs to be updated, namely pool.plugins.meta.
Changelog 4.2.0..4.2.1
- 84cef8a4fe
- [maven-release-plugin] prepare release 4.2.1
- a755481da7
- dcache-resilience: improve inaccessible file accounting
- f399ca1f5c
- dcache-resilience: skip invalid cancel filters
- cd8d67ec4a
- pool: fix ‘dcache pool convert’ command
- ecb03fa5c3
- scripts: update reference to configuration property
- 1664ca4f10
- pool: fix metadata migration tool to use Path
- a2caabae35
- ftp: always close proxied data connection if client closes their half
- cc96591b64
- vehicles: fail-fast on invalid path
- 7bcb0d6b58
- pool: retry request to pnfs manager if timed out
- b45f1982da
- dcache-frontend: invalidate transfer when killed mover not found
- 7526861552
- [maven-release-plugin] prepare for next development iteration
- 20539f937e
- dcache-frontend: add “Requires admin role” to alarms methods (Swagger)
Release 4.2.0
Admin
The admin sshd server supports public key and password authentication mechanisms. Some facilities, however, have security policies that explicitly disallow these mechanisms. This release adds the kerberos authentication mechanism to the admin sshd server and introduces the ability to enable specific authentication mechanism(s) to conform to different security policies. The following configuration variables have been added to control the behavior:
admin.ssh.authn.enabled = kerberos,password,publickey
By default, password and publickey are enabled. If Kerberos authentication is enabled, a second configuration variable specifies the location of the keytab file on your system:
admin.ssh.authn.kerberos.keytab-file = /etc/krb5.keytab
Frontend
The JSON returned from the RESTful admin services now sorts lists (especially things like pool names) in natural ascending order for improved readability. Swagger annotations for these services have been improved and regularized. If the history service is not running, the absence of data is now handled gracefully. Finally, when a mover is killed via the frontend but the transfer has already completed, the cached entry is now correctly invalidated; an intervening request for the mover listings from the frontend should display any killed mover’s state as “CANCELED” until it is removed from the cache during the ensuing refresh (to get this correct behavior, however, the pools must also be updated to 4.1.9 or later).
As mentioned above, this release includes a version of dcache-view (1.4.2) which implements the full set of admin pages.
Events
With this release, frontend includes a pluggable framework to support events, plus an example plugin: metronome.
The framework uses Server-Sent Events (SSE) to deliver events. It also includes a management framework for creating the SSE endpoint, discovering which kinds of events are supported, and managing which events are of interest.
A typical client first creates a channel. This is the SSE endpoint through which the client will receive information about events. Channels are user-specific and require authentication. Clients should not share channels: a connected client is disconnected if another client connects to this channel.
Each event has some basic metadata, such as an event type, and some event data. An event type is a label that describes broadly similar events. All event data for events from the same event type have the same schema, which may be queried.
Events from the SYSTEM event type are sent to all clients, but other events are only sent if a channel is subscribed to those events. A subscription is when the client expresses an interest in some events being delivered to a channel. Each subscription has a single selector which acts as a filter, describing which events should be delivered. The format of a selector depends on the event source and the schema for each event source is available.
The metronome plugin allows a client to receive a regular stream of simple text events.
Here is a simple illustration of this working. In this example, an X.509 credential is used. Note that any supported authentication may be used instead.
First, requesting a channel:
paul@celebrimbor:~$ curl -D- -E /tmp/x509up_u1000 -X POST https://prometheus.desy.de:3880/api/v1/events/channels/
HTTP/1.1 201 Created
Date: Tue, 26 Jun 2018 16:35:42 GMT
Server: dCache/4.3.0-SNAPSHOT
Location: https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, DELETE, PUT
Access-Control-Allow-Headers: Content-Type, Authorization, Suppress-WWW-Authenticate
Content-Length: 0
paul@celebrimbor:~$
The Location response header contains the channel. The channel allows the client to receive events:
paul@celebrimbor:~$ curl -E /tmp/x509up_u1000 -H 'Accept: text/event-stream' https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw
This command does not return, but receives events according to the Server-Sent Events (SSE) standard. For more details, see https://www.w3.org/TR/eventsource/
Using a separate terminal, we can create a subscription to start receiving events:
paul@celebrimbor:~$ curl -D- -E /tmp/x509up_u1000 -X POST -H 'Content-Type: application/json' -d '{"delay": 2, "message": "event ${count}"}' https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome
HTTP/1.1 201 Created
Date: Tue, 26 Jun 2018 16:41:17 GMT
Server: dCache/4.3.0-SNAPSHOT
Location: https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome/b969ccbf-2ba0-4f93-b249-3ce9554fb3f5
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, DELETE, PUT
Access-Control-Allow-Headers: Content-Type, Authorization, Suppress-WWW-Authenticate
Content-Length: 0
paul@celebrimbor:~$
The JSON entity sent with the POST request is the selector for this subscription and the URL targets the event-source: metronome.
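For clients that are not written around curl, the same subscription request can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function name subscription_request is hypothetical, and the request is only built, not sent. Sending it would additionally need an SSL context carrying the client credential, which is omitted here.

```python
import json
import urllib.request


def subscription_request(channel_url: str, event_source: str,
                         selector: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that subscribes a channel."""
    # The subscription endpoint lives below the channel URL, per event source.
    url = f"{channel_url.rstrip('/')}/subscriptions/{event_source}"
    # The selector travels as the JSON entity of the request.
    body = json.dumps(selector).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Channel URL and selector are taken from the curl example above.
request = subscription_request(
    "https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw",
    "metronome",
    {"delay": 2, "message": "event ${count}"},
)
print(request.get_method(), request.full_url)
```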
The client listening for events will now begin to receive events:
paul@celebrimbor:~$ curl -E /tmp/x509up_u1000 -H 'Accept: text/event-stream' https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw
event: metronome
id: 0
data: {"event":"event 1","subscription":"https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome/b969ccbf-2ba0-4f93-b249-3ce9554fb3f5"}
event: metronome
id: 1
data: {"event":"event 2","subscription":"https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome/b969ccbf-2ba0-4f93-b249-3ce9554fb3f5"}
event: metronome
id: 2
data: {"event":"event 3","subscription":"https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome/b969ccbf-2ba0-4f93-b249-3ce9554fb3f5"}
event: metronome
id: 3
data: {"event":"event 4","subscription":"https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome/b969ccbf-2ba0-4f93-b249-3ce9554fb3f5"}
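A client other than curl has to split this stream into individual events itself. The sketch below is a simplified parser for the SSE wire format, assuming (as the SSE standard specifies) that events are separated by a blank line, which the listing above does not show. It handles only the event, id and data fields; the function name parse_sse is hypothetical, and the sample data is a shortened version of the output above with the subscription field left out.

```python
import json
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from the lines of a text/event-stream body."""
    fields: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the event; events without data are dropped.
            if data:
                yield {**fields, "data": "\n".join(data)}
            fields, data = {}, []
        elif line.startswith(":"):
            continue  # comment line, carries no event information
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space is not part of the value
            if name == "data":
                data.append(value)
            elif name in ("event", "id"):
                fields[name] = value


# Shortened sample modelled on the metronome output above.
sample = [
    "event: metronome",
    "id: 0",
    'data: {"event":"event 1"}',
    "",
    "event: metronome",
    "id: 1",
    'data: {"event":"event 2"}',
    "",
]
events = list(parse_sse(sample))
for e in events:
    print(e["event"], e["id"], json.loads(e["data"])["event"])
```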
The subscription may be deleted if these events are no longer of interest.
paul@celebrimbor:~$ curl -D- -E /tmp/x509up_u1000 -X DELETE https://prometheus.desy.de:3880/api/v1/events/channels/AgW3_cnbrAQ8a0vjfZSwOw/subscriptions/metronome/b969ccbf-2ba0-4f93-b249-3ce9554fb3f5
HTTP/1.1 204 No Content
Date: Tue, 26 Jun 2018 16:41:51 GMT
Server: dCache/4.3.0-SNAPSHOT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, DELETE, PUT
Access-Control-Allow-Headers: Content-Type, Authorization, Suppress-WWW-Authenticate
paul@celebrimbor:~$
After this, the channel will stop receiving metronome events.
A channel may have many concurrent subscriptions, even different subscriptions from the same event source. It is possible for a client to discover which channels have been requested, and which subscriptions have been made per channel.
One further point to note: a channel that does not have a client connected for an extended period will be deleted automatically.
FTP
The GridFTP door (by default) tries to establish data connections directly between the client site and the pool. Ultimately, this is under the client’s control: the client can issue commands that force dCache to proxy data connections.
The MODE-E extension allows data to flow over multiple TCP connections and for those connections to be reused across multiple transfers. This makes it harder to know the current status of an FTP session’s proxy.
To help alleviate this problem, the info admin command now shows an ASCII-art rendition of the current data transfers. The door’s proxy is represented by a box, showing the door’s client-facing and pool-facing IP address and port numbers.
The corresponding remote IP address and port numbers for the pool and the client are also shown, along with the current status of the TCP connections using the following scheme:
Symbol | Meaning |
---|---|
======== | Established |
--<--<-- | Half-closed (data flowing to the left) |
-->-->-- | Half-closed (data flowing to the right) |
........ | Closed |
Here is an example, showing the status half-way through transferring a file with globus-url-copy using the options -no-g2 -p 5:
[prometheus] (GFTP-prometheus-AAVvh9jceUg@dCacheDomain) admin > info
User Host : 131.169.214.58
User : paul
[...]
Proxy status:
Client +---------------------Adapter---------------------+ Pool
131.169.214.58:39233========| 131.169.5.149:21218 131.169.5.149:37083 |========131.169.5.149:40000
131.169.214.58:39237========| 131.169.5.149:21218 |
131.169.214.58:39236========| 131.169.5.149:21218 |
131.169.214.58:39234========| 131.169.5.149:21218 |
131.169.214.58:39235========| 131.169.5.149:21218 |
+-------------------------------------------------+
TCP states: "========" means Established
GFTP-prometheus-AAVvh9jceUg@dCacheDoma[...]Receiving;12982;
[prometheus] (GFTP-prometheus-AAVvh9jceUg@dCacheDomain) admin >
NFS
For situations where transfers stay in pending state in the nfs door, two new admin commands are added that allow telling doors to ‘forget’ the transfer or retry it:
transfer retry
transfer forget
Such situations can be the result of network errors, or can arise when cell communication tunnels are restarted due to a core domain restart.
Some deployments may have a rather crowded NFS exports file. To make it more manageable, starting with release 4.2, dCache supports reading multiple export files from an exports.d directory.
The configuration property nfs.export.dir controls the location of the directory. By default, /etc/exports.d is used. Only files ending in .exports are considered. Files beginning with a dot are considered hidden and ignored.
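To check in advance which files in a directory would be picked up, the naming rule above can be mimicked with a few lines of Python. This is only a sketch of the rule as described, not dCache's own code; the function name select_export_files and the sample file names are hypothetical, and the sorted order is an assumption made for readable output.

```python
import tempfile
from pathlib import Path
from typing import List


def select_export_files(directory: str) -> List[Path]:
    """Return the files in 'directory' that match the exports.d naming rule."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file()
        and p.name.endswith(".exports")      # only *.exports files count
        and not p.name.startswith(".")       # dot-files are hidden and ignored
    )


# Demonstration in a throw-away directory instead of /etc/exports.d.
with tempfile.TemporaryDirectory() as d:
    for name in ("atlas.exports", ".draft.exports", "notes.txt"):
        (Path(d) / name).touch()
    names = [p.name for p in select_export_files(d)]
print(names)
```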
PNFS Manager
The internal pnfsid <=> inumber cache can be configured to run in cluster mode, i.e. multiple PnfsManagers can share the same cache. However, as this functionality is not required by the current release, the cluster configuration is not enabled.
Poolmanager
The psu remove pool command now supports globbing, i.e. wildcard characters can be used to refer to a set of pools to act upon:
psu remove pool dcache-pool-A*
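Because a glob can match more pools than intended, it may be worth previewing the match against a list of pool names before running the command. The sketch below uses Python's fnmatch module as a stand-in; dCache's glob syntax may differ from fnmatch in detail (fnmatch also understands ? and [...]), and the pool names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def pools_matching(pattern: str, pool_names: Iterable[str]) -> List[str]:
    """Return the pool names that a shell-style glob pattern would select."""
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in pool_names if fnmatchcase(name, pattern)]


# Hypothetical pool names for illustration.
pools = ["dcache-pool-A1", "dcache-pool-A2", "dcache-pool-B1"]
selected = pools_matching("dcache-pool-A*", pools)
print(selected)
```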
Resilience
Three issues of some importance have been addressed.
First, Resilience no longer will (incorrectly) report on non-resilient files which have been corrupted.
Second, the computation of the number of copies needed when a storage requirement changes was fixed so that it is now possible to go from N to M and have resilience react correctly (by removing all but M; in particular, the case where M = 1 was not being handled at all).
Finally, in connection with the latter, the default ‘required’ value on storage units has been changed from 1 to undefined. The only resulting behavioral change is that resilience will now completely ignore files with the default storage unit tag rather than alarming when no copies are readable. ‘-required=1’ is different in that resilience will instead raise an alarm if the single copy is not accessible. This change creates a clearer distinction between resilient and non-resilient files.
WebDAV
Milton’s handling of the HTTP OPTIONS method now corresponds to that of the Apache httpd server. This ensures a proper response, especially when an OPTIONS request targets an entity that does not exist, as is the case when uploading a file through a browser client like dcache-view.
dCache WebDAV now supports RFC 3230 clients that request digests using multiple HTTP headers: dCache accepts and processes multiple ‘Want-Digest’ headers.
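To make the effect of multiple ‘Want-Digest’ headers concrete, the following sketch merges several header values into one preference-ordered list, following the RFC 3230 rules that a missing q-value counts as 1.0 and a q-value of 0 marks an algorithm as unacceptable. It illustrates the header semantics only and is not dCache's implementation; the function name parse_want_digest is hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_want_digest(header_values: Iterable[str]) -> List[Tuple[str, float]]:
    """Merge Want-Digest header values into (algorithm, q) pairs, best first."""
    wanted: List[Tuple[str, float]] = []
    for value in header_values:
        # Each header value is itself a comma-separated list.
        for item in value.split(","):
            item = item.strip()
            if not item:
                continue
            algorithm, _, params = item.partition(";")
            q = 1.0  # RFC 3230: a missing q-value is treated as 1.0
            for param in params.split(";"):
                key, _, val = param.strip().partition("=")
                if key.strip().lower() == "q" and val.strip():
                    q = float(val)
            # Algorithm names are case-insensitive, so normalise them.
            wanted.append((algorithm.strip().lower(), q))
    # Drop q=0 entries and sort the rest by descending preference.
    return sorted((w for w in wanted if w[1] > 0), key=lambda w: -w[1])


# Two separate headers, as a client might send them.
preferences = parse_want_digest(["adler32;q=0.5", "MD5, sha;q=0"])
print(preferences)
```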
XRootD
Since 4.1.0, dCache can publish billing messages to Apache Kafka; when Kafka is enabled, these events can be consumed outside of dCache. As in 4.1.0, Kafka can be enabled in 4.2.0 for xrootd by setting the following properties:
xrootD.enable.kafka controls whether the Kafka messaging system is used for message delivery.
xrootD.enable.kafka = true|false
xrootD.kafka.bootstrap-servers = host1:port,host2:port
Changelog from 4.1.0 to 4.2.0
- 7f02ae6bee
- [maven-release-plugin] prepare branch 4.2
- ee8f1148ce
- no longer disable Java ECC algorithms
- 2cad93e4ed
- webdav: do not return 404 for OPTIONS request targeting absent entity
- e4b9f72b38
- dcache: release dcache-view version 1.4.2
- e90398f905
- core: support MIME discovery from more file extensions
- 89f105964f
- scripts: fix port listing for httpd service
- ebb8633498
- docs: add dcache-bird into project source tree
- e2b5adc2fd
- libs: update nfs4j–0.17.2
- 0958835ac4
- frontend: use ThreadLocal to record user identity information
- cf39525a1a
- pom: update to nfs4j–0.17.1
- 0a14ac7756
- skel: replaced dead links to antlr documentation
- df6498e339
- nfs: add support for exports.d directory
- b08d2e4fb0
- libs: update to nfs4j–0.17.x
- 0038b882e1
- dcache: setter chanegd to public
- 7e1e56b08e
- pool: record events when writing/deleting files
- db7d9a07de
- xrootd: integrate Kafka to xrootdDoor
- da7f2f46ff
- pool: fix logging of replica-store on start-up
- af60a09f96
- hazelcast: don’t phone home
- 212a8165d5
- frontend: update and refactor the metronome plugin
- 5c389dad6d
- frontend: send notification to identify cause of disconnect
- eb202e3a49
- frontend: avoid deadlock
- f230aa4c75
- frontend: add subscription change notification
- b1fc60ba01
- frontend: avoid resource leak when loading JSON
- ece1ce1576
- frontend: drop support for SSE keep-alive
- 1c962917d5
- frontend: disable support for compressed output
- eaaa430dec
- nfs: create write movers for NEW file only
- 822c6f19f3
- libs: update Zookeeper to version 3.4.12
- 6cbe61f37f
- frontend: add generic support for events
- 510b5222ac
- core: update NameSpaceProvider#getParentOf to include name of target in parent
- e47c9abd71
- chimera: extend getParentOf to include the name of the entry
- b3f90f2508
- pool: disallow enabling pool while its repository is loading
- 9161b20422
- webdav: fix java.lang.ClassCastException: error
- 2781a1624a
- dcache: release dcache-view version 1.4.1
- 3bbe49e032
- docker: Add a way to create docker image
- 5bf34c5a7e
- chimera: ensure that we call JdbcFs#close to shutdown hazelcast
- f17bafb07b
- pool: fix handling file-not-found in migration module
- a547e03625
- chimera: use hazelcast cache for pnfs id to inumber mapping
- ad4f22bbb4
- webdav: add kafka to webdavdoor
- 8282a79424
- kafka: add common LoginPrudicerListener for Kafka integration
- a48462f253
- scripts: add support for parsing ZooKeeper transaction logs
- 5c96cc5bf7
- ftp: log failures to wrap/encrypt responses
- 1bb68a8ade
- general: remove unnecessary getCharset calls
- 2ff49b600c
- poolmanager: encapsulate assumptions into Pool
- 7487d0a90a
- nfs: fix invalid types in equals (regression in a11ec8d76b)
- 7a287eed31
- nfs: don’t use org.dcache.utils.Bytes#fromHexString
- 249ae6c90a
- dcache-resilience: restore accidentally removed initialization of final field
- 26b0c09b61
- dcache-resilience: eliminate unnecessary SelectionAction enum*
- 4f2bd5f714
- dcache-resilience: eliminate dead/unused code
- f57d7f7cf1
- dcache-resilience: modify storage unit resilience requirement definition
- 410ed2a744
- dcache-resilience: fix computation of operation count when storage requirements change
- b7c068720a
- webdav, pool: handle https redirects
- 0ac1251716
- webdav: log errors if OIDC delegation fails
- f5c48ef709
- vehicles: use PnfsGetFileAttributes when resolving a path to PnfsId
- 7822fd9165
- ftp: returned error is too vague for meaningful investigation
- 549e54b36a
- core: fix NPE in Transfer#getIoDoorEntry
- f0fe914599
- psu: replace Map#get+Map#put with Map#putIfAbsent
- d3115af6e4
- poolmanager: convert Unit type into enum
- aec4ee7657
- nfs: update NFSTransfer to return an array of mirror servers
- efd60b5e96
- frontend: add swagger Tag descriptions
- bf2b3c38ae
- nfs: add commands to reactivate stale transfers
- 2b56c0f782
- fix broken commit
- 4e135acdad
- test: fix broken commit a11ec8d76b
- 72f0f8070d
- vehicles: remove PnfsSetChecksumMessage message
- a11ec8d76b
- vehicles: introduce Pool class to glue pool name and address
- 887d8d5738
- poolmanager: use Optional to store pool selection on stage retry
- 39d6a6e1cd
- poolmanager: fix migration command if named pool is removed
- e688302c92
- dcache-resilience: repair over-aggressive handling of broken file messages
- 195733d197
- ftp: improve proxy logging to facilitate debugging
- 74c3fe225c
- pool: fix error message for failed active FTP transfers
- 2a2fd1196e
- cells: remove extra synchronized block in CellMessageDispatcher#findReceivers
- 3da722c9ee
- gplazma-fermi: fix last modified check in junit test
- 0d52220359
- pool: fix data handling to use boolean rather than an enum masquerade
- b87180235f
- spacemanager: add remote pool monitor debug logging
- 1efab08893
- ftp: avoid NPE if connection is closed.
- d7c82153e8
- ftp: add detailed information about proxy status
- 23d68c20e8
- core: don’t use Subject#toString in IoDoorEntry
- 3cebeace5f
- pom: enable deprecation warning at compile time
- c17d74d9cd
- libs: update guava and slf4j dependencies
- 2c5b2c515b
- gplazma-fermi: add mapping plugin to support VO group and username from file
- b18b4775ed
- nfs: filter out IPv6 DS addresses if client connected with v4
- 1f977edebd
- chimera: adjust postgres driver provider to new version schema
- 9ea80c230a
- dcache-webadmin: remove
- 64300b79f6
- pom: use nfs4j–0.16.1
- 6e25ce6d9a
- skel: remove extraneous cell-info dir
- 7385ae03cf
- packaging: add missing chown and chmod on pool-history
- fb4b738683
- skel: make deprecated alarms properties forbidden
- ad8a9ad724
- dcache-frontend: revise Swagger annotations
- 10160be4cd
- dcache: release dcache-view version 1.4.0
- 9f68dbaeef
- webdav: support multiple RFC 3230 ‘Want-Digest’ headers
- 9280e4d658
- ftp: ensure half-closed connections are fully closed on return
- c7af0ada5d
- ftp: close pool connection with MODE-E proxy
- a379c4ca7b
- psu: add glob support for ‘psu remove pool’ command
- 17f2e54425
- dcache-frontend: avoid null dereferencing for incomplete pool history data
- f9e4fbef43
- dcache: avoid NPE from initialization race in RestoreRequestsReceiver in HttpPooMgrEngineV3
- 22aa094bcc
- [maven-release-plugin] prepare for next development iteration