Highlights

  • NFS read performance improvement
  • Kafka logging for billing
  • Introduce hot file replication mechanism on pools, where the pool itself triggers replication of hot (popular) files

Incompatibilities

  • Xrootd PrepareRequest will return Unsupported.
  • Frontend and webdav doors enforce request limiters by default.

Acknowledgments

We want to thank Shawn McKee for his contributions.

Release 11.2.4

alarm

The alarm server was reported to be vulnerable to a remote code execution (RCE) attack due to unprotected object deserialization. This is now fixed.

bulk

dCache now replies with a 413 Entity Too Large status code when any of the bulk request limits is exceeded.

chimera

File creation with non-basic attributes is not an atomic operation, as the provided namespace may implement it as create+set attr. If the initial file attributes don’t allow the set-attribute operation, the create request will fail.

This is now fixed, and the initial, non-basic attributes can be applied on create.

If a directory tag is created without any content, ResultSet#getBinaryStream returns null, which will trigger an NPE.

This is now fixed.

gplazma

READ_METADATA operations are authorized for requestors without read access if the requestor provides a token with the storage.poll claim.
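
The rule above can be illustrated with a minimal sketch. It assumes the token's scope claim is a space-separated list of entries of the form `name` or `name:/path`; the function name `poll_scope_allows` and the path-prefix matching are illustrative assumptions, not the gplazma implementation.

```python
def poll_scope_allows(scope_claim, path):
    """Return True if a storage.poll entry in the scope claim covers `path`.

    Hypothetical helper: a True result stands for "READ_METADATA may be
    authorized even without read access".
    """
    for entry in scope_claim.split():
        name, _, base = entry.partition(":")
        if name != "storage.poll":
            continue
        # Assumption: an entry without a path applies to the whole namespace.
        base = (base or "/").rstrip("/")
        if path == base or path.startswith(base + "/"):
            return True
    return False
```

For example, `poll_scope_allows("storage.poll:/data", "/data/file1")` is true, while a path such as `/database/x` is not covered by the `/data` prefix.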

pool

It is desirable to have a single admin command to display the values of both hot-file migration parameters (replicas and threshold).

It is also desirable to see quickly whether the hot file migration facility is enabled on a given pool.

Two additions address this:

A new pool command, hotfile show, has been added to MigrationModule; it displays both parameters (for example, replicas=3 threshold=5). The hot file replication enablement status has been added to the pool section of info -a.

Firefly onStart marker for MoverProtocol-based transfers has been added.

Some test files still used the obsolete property name pool.hotfile.monitoring.enable. Although these were only tests, they could cause or perpetuate confusion and errors in user configuration.

pool.hotfile.monitoring.enable has been changed to pool.hotfile.replication.enable in all cases.

rpm

dCache can run with Java 17 as well as Java 21. However, if a site decided to run dCache with Java 21, the RPM would pull in java17 as a dependency anyway.

Now, if Java is not installed, the RPM pulls in java17. If java21 is already installed, no extra Java packages are pulled in.

tape

While testing the storage.poll scope, it was observed that querying file locality required the DOWNLOAD activity to be present in the list of allowed activities in the OIDC scope claim. This is unnecessary and complicates handling of storage.poll.

The DOWNLOAD activity is no longer required to query file locality. This allows storage.poll to be implemented as a READ_METADATA activity.

Changelog 11.2.3..11.2.4

72eeeae7c7
[maven-release-plugin] prepare release 11.2.4
d14e394698
chimera: applying attributes to newly created file should skip permission check
1c7b7adf5a
ci: use custom k8s deployment for minio
451c5a3815
ci: skip git clone it sources are not needed
9b4b56f48a
chimera: fix reading of empty tags
cdc2a07854
bulk: report 413 Entity Too Large is limits on bulk requests are exceeded
0b6b21c5f8
gplazma2-oidc: Add and test support for storage.poll WLCG claim
3474d2e7a7
tape rest API: remove AccessMask.READ_DATA requirement when querying for file attributes for file locality
423e79a774
alarms: use HardenedLoggingEventInputStream to address possible RCE when deseriaizing log messages
5877bb383c
scripts: fix double rw definition in FIO benchmark script
db2a697322
pool: New hotfile show command, and show enablement status with info -a (#8067)
f7563db3c5
pool: emit firefly onStart marker from RemoteHttpDataTransferProtocol
07a898d8c1
libs: update apache-curator to 5.9.0
faec79f06a
pool: Improve logging for hot file replication
4c74631875
pool: Fix properties tests to use the correct property
cec1e72438
build(deps): bump io.netty:netty-codec-http
593be28945
rpm: make package require java17 or java21
6f18dc1158
[maven-release-plugin] prepare for next development iteration

Release 11.2.3

firefly

The current release fixes several issues for firefly, including SciTag WAN TPC marker generation and validation.

pool

Hot file migration destination pools are now prioritized, and match network and protocol requirements of the file request that triggered the migration.

Changelog 11.2.2..11.2.3

63491dcc6c
[maven-release-plugin] prepare release 11.2.3
c5cfdc1525
pool: enhance migration pool choice logic
77426b981d
pool: fix failing unit tests
ec4c2489c5
firefly: address latest PR 8044 inline feedback
a660c4eb20
firefly: address PR 8044 review fixes
2eca0648f7
xrootd: fix logSciTagsRequest called before cell address initialization
4aea8bf9a7
Fix SciTag WAN TPC marker generation and validation
2f1d03c42b
pool: exclude p2p requests from numberOfRequestsFor count
5a227a2fbf
[maven-release-plugin] prepare for next development iteration

Release 11.2.2

common

Firefly sending semantics now use existing configuration variables while preserving default behavior.

  • pool.enable.firefly=true: continue sending fireflies to the transfer flow destination (UDP 10514).
  • pool.firefly.destination unset: no extra copy is sent.
  • pool.firefly.destination set: send one additional copy to that destination.

The configured pool.firefly.destination accepts:

`host (defaults to UDP 10514)`
`host:port (explicit port override)`
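
The two accepted forms can be sketched as a small parser. This is only an illustration of the documented semantics; `parse_destination` and `DEFAULT_FIREFLY_PORT` are hypothetical names, not dCache code, and IPv6 literals are deliberately not handled.

```python
DEFAULT_FIREFLY_PORT = 10514  # default UDP port named in the text above


def parse_destination(value):
    """Turn a pool.firefly.destination value into (host, port), or None if unset.

    Hypothetical helper; raises ValueError if the port is not a number.
    """
    value = value.strip()
    if not value:
        return None  # unset: no additional copy is sent
    host, sep, port = value.rpartition(":")
    if not sep:
        # Plain "host": fall back to the default UDP port.
        return (value, DEFAULT_FIREFLY_PORT)
    # "host:port": the explicit port overrides the default.
    return (host, int(port))
```

With this sketch, `collector.example.org` resolves to port 10514, and `collector.example.org:9000` to port 9000 (the host name is a placeholder).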

WAN SciTag support for remote HTTP(S) TPC movers has been added.

namespace

Many parts of dCache expect that a file's access latency and retention policy are always defined. The namespace falls back to default values if no explicit tags are defined. However, there is a shortcut that skips the default values if the WriteToken tag is specified. The recently introduced ArchiveMetadata functionality is directly affected by this behaviour:

java.lang.IllegalStateException: Attribute is not defined: RETENTION_POLICY.

This is now fixed.

nfs4j

nfs4j has been upgraded to 0.27.2, in which server-side copy now handles the unsupported case of asynchronous copy.

pool

Admins often want to know the expected I/O rates that a pool can provide. The desired benchmarks can be hard to configure. Thus, it makes sense to provide a ‘good starting point’ with dCache.

The new command dcache pool benchmark provides a simple test to benchmark a pool's filesystem.

Changelog 11.2.1..11.2.2

362231d363
[maven-release-plugin] prepare release 11.2.2
f6b018450c
script: add dcache pool benchmark command
47ab4da4e4
Add WAN SciTag support for remote HTTP(S) TPC movers
376b2104bd
Merge pull request #8039 from ShawnMcKee/fix/firefly-sci-tag-boundary-fqan-fallback
9f4b114ed6
Firefly: address PR review comments
3eaec9e29a
Firefly: send additional copy to configured collector
65a61e394c
pom: update nfs4j to 0.27.2
e6b2b4bbfb
namespace: ensure that access latency and retention policy are always defined
0b3f22fb52
build(deps): bump org.apache.zookeeper:zookeeper from 3.8.4 to 3.8.6
f845197194
[maven-release-plugin] prepare for next development iteration

Release 11.2.1

pool

Fireflies were not sent for WebDAV transfers; this is now fixed. TransferLifeCycle now accepts https as a protocol that needs fireflies.

restapi

With the upgrade of the Jackson dependencies for 11.x, handling of the Optional data type started to fail as follows:

“Unable to interpret JSON: Java 8 optional type java.util.Optional<java.lang.Double>…”

when calling the REST API to query pool information. This is now fixed.

Changelog 11.2.0..11.2.1

3c04cd263d
[maven-release-plugin] prepare release 11.2.1
175f39bf9a
dcache cli: fix issue with printing pool size when max diskspace is set in percentage
f321f1eeb4
http/tpc: add support for ArchiveMetadata header
e80671c94a
rest api: add support for Java8 Optional data type
7100800aa4
pool: send firefly for https transfers
08e32fb45d
[maven-release-plugin] prepare for next development iteration

Release 11.2.0

Billing

Starting with version 11.2.0 dCache adds transferTag information to the generated JSON records. The transferTag is used by clients to label transfers with activity-specific information, such as experiment ID and workflow ID. This information is currently provided by XRootD and HTTP clients, and can be used by monitoring systems to identify data flows.

A Kafka producer operates asynchronously by default: when an application calls send(), the record is placed into an internal buffer and the call typically returns immediately, without waiting for broker acknowledgements.

However, asynchronous operation does not guarantee fully non-blocking behavior. Under certain conditions—such as broker unavailability, missing topic metadata, or exhausted producer buffers—the producer may block or fail synchronously. These edge cases are a common source of confusion and can lead to unexpected application failures.

In dCache, this behavior poses a risk when Kafka is unavailable, as transfer events may not be marked as successful if Kafka reporting fails, potentially causing service instability or crashes. To mitigate this risk, the Kafka logging has been relocated to the non-blocking Billing service and disabled in other dCache components that may block on Kafka availability. As a result, Kafka cluster outages no longer impact the stability of dCache processes.
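
The general idea of keeping a slow or unavailable message broker off the critical path can be sketched as follows. This is a generic illustration, not the Billing service's implementation: `NonBlockingReporter`, its `send` callable, and the drop-when-full policy are all assumptions made for the example.

```python
import queue
import threading


class NonBlockingReporter:
    """Hand records to a background thread so callers never wait on delivery."""

    def __init__(self, send, capacity=1000):
        self._send = send  # callable that delivers one record (may block or fail)
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def submit(self, record):
        """Queue a record without blocking; return False if it had to be dropped."""
        try:
            self._queue.put_nowait(record)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _drain(self):
        while True:
            record = self._queue.get()
            try:
                self._send(record)
            except Exception:
                pass  # a failed delivery must not affect the caller
            finally:
                self._queue.task_done()

    def flush(self):
        """Wait until every queued record has been handed to `send`."""
        self._queue.join()
```

In this sketch a transfer would call `submit()` and continue regardless of the outcome; whether dropping records on overflow is acceptable depends on the deployment.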

To enable Kafka for the billing service, enable the property billing.enable.kafka, which is declared in billing.properties as (one-of?true|false|${dcache.billing.enable.kafka})billing.enable.kafka.

Frontend / Webdav

A request-rate-limiting mechanism has been introduced for the Jetty-based WebDAV and Frontend doors to improve resilience against misbehaving or abusive clients. Previously, the system assumed well-behaved clients, which could allow even a single client to overwhelm the service. The new implementation adds a Jetty handler layer that tracks request outcomes and enforces both global and per-client rate limits using configurable thresholds backed by in-memory caches and rate limiting controls. Clients exceeding these limits receive HTTP 429 responses and may be temporarily blocked. Administrators can reset blocked clients via new admin commands. Several configuration properties have been added to control request rates, error thresholds, blocking windows, and limits on blocked clients, providing flexible protection against denial-of-service scenarios.

The request rate limiter is controlled through a set of configuration properties that define how aggressively clients are throttled or blocked, both globally and individually:

webdav.limits.max-blocked-clients

Sets the maximum number of clients that can be tracked as blocked at any given time. This prevents unbounded memory usage if many clients are misbehaving simultaneously.

Global rate limiting

webdav.limits.rate.overall

Defines the total request rate allowed across all clients combined. This acts as a system-wide throttle to protect the service under heavy load.

Per-client rate limiting

webdav.limits.rate.per-client.fractions

Specifies how much of the global rate each individual client is allowed to consume (typically as a fraction of the overall rate).

webdav.limits.rate.per-client.block.window.time
webdav.limits.rate.per-client.block.window.time.units

Define how long a client remains blocked after exceeding its rate limit.

Error-based blocking

webdav.limits.error.max-allowed

The maximum number of failed or problematic requests a client may generate within a time window before being blocked.

webdav.limits.error.block.window.time
webdav.limits.error.block.window.time.units

Define the duration of the observation window for counting errors, and how long a client is blocked once the threshold is exceeded.

Blocked client management

webdav.limits.blocked-clients.idle-time
webdav.limits.blocked-clients.idle-time.units

Control how long a blocked client remains in the blocked list without activity before being automatically removed.

The same set of properties applies to the frontend door, with the prefix frontend.
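
The interplay of a per-client rate, a blocking window, and an admin reset can be sketched in a few lines. This is a simplified model under stated assumptions, not the Jetty handler shipped with dCache: the class `ClientRateLimiter`, its parameters, and the sliding-window counting are hypothetical, and the global limit and error-based blocking are left out.

```python
import time


class ClientRateLimiter:
    """Per-client request limit with a temporary block (illustrative only)."""

    def __init__(self, max_requests, window_seconds, block_seconds,
                 clock=time.monotonic):
        self._max_requests = max_requests
        self._window = window_seconds
        self._block = block_seconds
        self._clock = clock
        self._hits = {}           # client -> timestamps of recent requests
        self._blocked_until = {}  # client -> time at which the block expires

    def check(self, client):
        """Return the HTTP status to use: 200 to proceed, 429 if throttled."""
        now = self._clock()
        until = self._blocked_until.get(client)
        if until is not None:
            if now < until:
                return 429
            del self._blocked_until[client]  # block expired
        recent = [t for t in self._hits.get(client, []) if now - t < self._window]
        recent.append(now)
        self._hits[client] = recent
        if len(recent) > self._max_requests:
            # Limit exceeded: block the client for the configured window.
            self._blocked_until[client] = now + self._block
            self._hits[client] = []
            return 429
        return 200

    def reset(self, client):
        """Clear a client's state, as an admin reset command might do."""
        self._blocked_until.pop(client, None)
        self._hits.pop(client, None)
```

The injectable `clock` is there only to make the behaviour easy to test; a blocked client keeps receiving 429 until its block expires or `reset()` is called.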

NFS

The NFS door has been updated to use the original NFS request credentials when Proxy-IO is used.

Pool

The NFS mover has been updated to use zero-copy to reduce memory copies when data is sent to a client during READ. This gives up to a 20% increase in data throughput for data-intensive applications.

A hot file replication mechanism has been introduced on pools, where the pool itself triggers replication of hot (popular) files. When the number of in-flight transfers for a file reaches a configurable threshold, the pool invokes the migration module to create a configurable number of replicas on other pools in the same pool group. New hotfile commands in the pool admin interface allow customization of the parameters affecting hot file replication. The new boolean configuration property pool.hotfile.replication.enable enables the feature when set to true.
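
The threshold-based trigger described above can be sketched as follows. This is a toy model, not the pool's implementation: `HotFileMonitor`, the `replicate` callback, and the choice to trigger only once per file are assumptions made for the example.

```python
from collections import Counter


class HotFileMonitor:
    """Count in-flight transfers per file and trigger replication at a threshold."""

    def __init__(self, threshold, replicas, replicate):
        self._threshold = threshold
        self._replicas = replicas
        self._replicate = replicate  # callback(file_id, replicas); hypothetical
        self._in_flight = Counter()
        self._replicated = set()     # assumption: request replication once per file

    def transfer_started(self, file_id):
        self._in_flight[file_id] += 1
        if (self._in_flight[file_id] >= self._threshold
                and file_id not in self._replicated):
            self._replicated.add(file_id)
            self._replicate(file_id, self._replicas)

    def transfer_finished(self, file_id):
        if self._in_flight[file_id] > 0:
            self._in_flight[file_id] -= 1
```

With a threshold of 5 and 3 replicas, the fifth concurrent transfer of the same file would call `replicate(file_id, 3)`; further transfers of that file would not trigger it again in this sketch.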

Xrootd

The Xrootd protocol provides a mechanism to trigger file staging via a prepare request. Up to version 11.2.0, dCache responded with OK to an Xrootd prepare request. Depending on the client application logic, this may enable subsequent open calls on files that are not available in cache, potentially blocking the client application due to large tape access latencies and causing inefficient access to tape resources. This behavior has now changed: the prepare request fails with the error Unsupported. Users who need staging should use the TAPE-REST-API.

Changelog from 11.1.0 to 11.2.0

f6e47327ab
[maven-release-plugin] prepare branch @{releaseLabel}
b21c65af3e
bulk: catch another place generating verbose logging
a9f5e776f1
pool: implememt hot file replication
924ef9354d
code coverage: bump JaCoCo version to 0.8.14
9a010bf35a
Update JsonWebToken unit tests for ECDSA signatures
47bcf81c03
packages: fix gplazma2-python plugin version
10396b94c0
ci: enable kafka in billing
ad03a7caaf
Revert “fix broken test due to kafka move to billing”; fix the failing test
7e028b08d9
libs: update maven-compiler-plugin version to 3.10.1
fd98eb1850
fix broken test due to kafka move to billing
4be74060f8
billing: migrate Kafka logging to Billing service
01ace93a31
frontend: validate JSON in migration request
766f7b36bd
pom: bump jackson to 2.18.5
31eff98dfb
pom: bump netty version to 4.2.9
e8d25383b7
ci: escape strings in frontend test environment
466b32883f
bulk: avoid spamming log file when requests have already been cleared due to auto-clear flag
7a5b413ef4
nfs: update show pools command to display pure IPs
f77a00aea8
Minor AI-suggested improvements to Maven config
fb5f925357
Revert “Minor AI-suggested improvements to Maven config”
07532e159e
Minor AI-suggested improvements to Maven config
6256b08b95
Prevent unwanted spinning on migration jobs
3d91759804
billing: add transfer tag to moverInfo json
297a7ba0b6
gplazma: update explain login to allow credentials
a9871213ba
common: increase robustness of access log file
9f0a67d72e
build(deps): bump io.netty:netty-codec-http
e800889414
docs: replace dCache components diagram
c7cbf8eede
webdav: fix range header formatting in relay request
4ba5e32039
pool: use zero-copy FileChunk to perform NFS reads
4191e1c0cc
libs: update swagger-jersey2 version to 1.6.14
99ada5032a
libs: update swagger version to 5.30.2
a0c481b062
frontend: Fix issue 7968
1719c0b59e
Address PR #7785 review comments: fix DER encoding for ECDSA signatures and add unit tests
fe7be7478c
ci: add dcache db migration test
14a743ef4f
srm,spacemanager: fix liquibase migration checksum mismatch
873ce0cc68
fix the propert va;ue for frontend limits error max-allowed
8de5c6f185
common-security: add workaround bouncycastle EC algorithm name mismatch
ec0c009b32
frontend: add code flow auth to dcache-view
912691890e
gplazma: fix banfile plugin
426a95039d
webdav: return 403 on unauthorized attempt to rename resource
c483a3ce65
pool: use SSLTrustManagerWithHostnameChecking to initialize CAnL
430b1cec17
ci: stop using voms server
defc79cd40
libs: update jersey version to 2.47
76fbde2575
nfs: use user credentials to perform proxy-I/O
549237caf6
pool: add command line to check the correctness of online replica having sticky bit
6c1074f5c4
transfermanagers: complete removal of DB support
88ac1d0cea
Reapply “libs: update jackson to version 2.18.3”
48d3b549ff
gplazma: ldap make search base DNs more flexible
48df33385f
ci: use java21 build images
3560e86630
qos: frontend tests for qos transition
d5d76b4a2b
pool: merge MoverFactory interface into TransferService
831145ae2c
pool: fix mover’s local endpoint reported by a mover
fc3faf12ef
qos: frontend tests for qos transition
7819d3a980
book: document ‘role’ configuration option of omnisession plugin
e7d2304abf
src: use Java17 HexFormat instead of guava alternative
a501b3da33
xroot: don’t reply with OK on PrepareRequest
96a88e6efe
ci: add qos policy endpoint tests to the gitlab pipeline
6ca0a2ed58
common: remove unused constructor
dcdf855975
ci: fix path to custom helm config
6f3b040aa7
ci: customize helm for CI
e2f575e2af
build(deps): bump org.bouncycastle:bcpkix-jdk18on from 1.78.1 to 1.79
c4139c7e7f
ci: update assertion message for all frontend tests
635c15cbd5
dcache: enforce rate limiting only for authentication errors
2c5e92cbc8
docs: fix references to old ‘cleaner’ service
f3aab0de2b
pool: log at warn when files state in pool change
dd5ae2d7d8
utils: update ExceptionsTests to correctly handle java 19+
341c77e65f
pnfsmanager: don’t use guava’s ListenableFuture to notify flush listeners
7b348be0df
pom: update test dependencies
cb8cb0924e
ci: rewriting qos-policy endpoint tests
c49820e441
gplazma: enforce RolePrincipal if Role attribute is set
fe4d1896ed
nfs: lower nfs-client workaround message
2e630594af
ci: use chainguard-dev version of kaniko
d68c5cfd40
test: drop snicked system.out.print
cd23f595bd
poolmanager: fix wrandom partition
a25afab99e
build: fix missing resources caused by specification of resources
eef720e567
dcache: fix use of vulnerable deserialization methods in test
8e7c5b63eb
frontend: introduce request rate limter to frontend
fbeba80b26
build: remove vestigial SMC configuration from pom.xml files
5d54e52624
build: Update SMC library and corresponding maven plugin
39c7320a0b
build : fix hysteresis with system-test
c74e8e4d39
build: Update surefire and configure to redirect test output
b2344ec153
pool: extract comparator creation to function in IoQueueManager
8a75cc1749
Revert “Minor typo in comment”
1bd3809d12
Revert “IDE-suggested drive-by: extract comparator creation”
c25a037df2
IDE-suggested drive-by: extract comparator creation
4b6410b438
Minor typo in comment
3f6d72a1f4
jetty: introduce request rate limiter
f8ff33abbf
omnisessio: accept tabs in config file
c08a1bdfa7
BUILDING.md: update how start the container
4030e00e8d
[maven-release-plugin] prepare for next development iteration
4b771ef3f9
Update JsonWebToken.java
5155d12c7d
Update JsonWebToken.java
6e5c91b36e
Update JsonWebToken.java
02e99fc200
Update JsonWebToken.java
2d22189329
Update JsonWebToken.java
0992f8bda5
add support for EC Public Keys