When writing data into the dCache, and possibly later on into an HSM, checksums may be calculated at different points within this chain.
- Client Checksum
The client calculates the checksum before or while the data is sent to the dCache. Depending on when it has been calculated, the checksum value may be sent together with the open request to the door and stored in pnfs before the data transfer begins, or it may be sent with the close operation after the data has been transferred. The dCap protocol provides both methods, but the dCap clients use the latter by default. The FTP protocol does not provide a mechanism to send a checksum. Nevertheless, some FTP clients can (mis-)use the “site” command to send the checksum prior to the actual data transfer.
- Transfer Checksum
While data is coming in, the server data mover may calculate the checksum on the fly.
- Server File Checksum
After all the file data has been received by the dCache server and the file has been fully written to disk, the server may calculate the checksum, based on the disk file.
The table below sketches the different schemes for dCap and FTP with and without client checksum calculation:
Table 20.1. Checksum calculation flow
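Step | FTP (w/o initial CRC) | FTP (with initial CRC) | dCap |
---|---|---|---|
1 | Create Entry | Create Entry | Create Entry |
2 | | Store Client CRC in pnfs | |
3 | Server calculates transfer CRC | Server calculates transfer CRC | Server calculates transfer CRC |
4 | | Get Client CRC from pnfs | Get Client CRC from mover |
5 | | Compare Client and Server CRC | Compare Client and Server CRC |
6 | Store transfer CRC in pnfs | | Store client CRC in pnfs |
7 | Server calculates disk file CRC | Server calculates disk file CRC | Server calculates disk file CRC |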
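The three checksum points can be illustrated with a small, self-contained sketch. It is not dCache code: the class and method names are hypothetical, the Adler-32 algorithm is chosen only because it ships with the JDK, and a temporary file stands in for the pool's disk file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.Adler32;

/** Illustrative only: these names are not part of dCache. */
class ChecksumFlowSketch {

    /** Checksum over a complete byte array (client side, or a re-read disk file). */
    static long adler32(byte[] data) {
        Adler32 sum = new Adler32();
        sum.update(data, 0, data.length);
        return sum.getValue();
    }

    /** Copies in to out and returns the checksum of the bytes seen on the way. */
    static long copyWithChecksum(InputStream in, OutputStream out) throws IOException {
        Adler32 sum = new Adler32();
        byte[] buffer = new byte[8192];
        int rc;
        while ((rc = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, rc);
            sum.update(buffer, 0, rc);   // transfer checksum, calculated on the fly
        }
        return sum.getValue();
    }

    /** Runs all three steps and reports whether the checksums agree. */
    static boolean roundTrip(byte[] payload) throws IOException {
        long clientCrc = adler32(payload);                        // client checksum
        Path diskFile = Files.createTempFile("checksum-sketch", ".dat");
        try {
            long transferCrc;
            try (InputStream in = new ByteArrayInputStream(payload);
                 OutputStream out = Files.newOutputStream(diskFile)) {
                transferCrc = copyWithChecksum(in, out);          // transfer checksum
            }
            long diskCrc = adler32(Files.readAllBytes(diskFile)); // server file checksum
            return clientCrc == transferCrc && transferCrc == diskCrc;
        } finally {
            Files.deleteIfExists(diskFile);
        }
    }
}
```

In this sketch all three values necessarily agree; in a real transfer each comparison guards a different stage (network transfer versus disk write).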
As far as the server data mover is concerned, only the Client Checksum and the Transfer Checksum are of interest. While the client checksum is just delivered to the server mover as part of the protocol (e.g. the close operation for dCap), the transfer checksum has to be calculated by the server mover on the fly. In order to communicate the different checksums to the embedding pool, the server mover has to implement the ChecksumMover interface in addition to the MoverProtocol interface. A mover that does not implement ChecksumMover is assumed not to handle checksums at all. The Disk File Checksum is calculated independently of the mover within the pool itself.
    public interface ChecksumMover {
        public void setDigest( Checksum transferChecksum ) ;
        public Checksum getClientChecksum() ;
        public Checksum getTransferChecksum() ;
    }
The pool may call the setDigest method to advise the mover which checksum algorithm to use. If setDigest is not called, the mover is not expected to calculate the Transfer Checksum.
    java.security.MessageDigest transferDigest = transferChecksum.getMessageDigest() ;
    ***
    while( ... ){
        rc = read( buffer , 0 , buffer.length ) ;
        ***
        transferDigest.update( buffer , 0 , rc ) ;
    }
getClientChecksum and getTransferChecksum are called by the pool after the MoverProtocol's runIO method has completed successfully. These routines should return null if the corresponding checksum could not be determined for whatever reason.
    public void setDigest( Checksum transferChecksum ){
        this.transferChecksum = transferChecksum ;
    }
    public Checksum getClientChecksum(){
        // no checksum received from the client: report null
        return clientChecksumString == null ?
               null :
               new Checksum( clientChecksumString ) ;
    }
    public Checksum getTransferChecksum(){
        return transferChecksum ;
    }
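The fragments above can be put together into a runnable sketch. Everything in it is a stand-in: SketchChecksum, SketchChecksumMover and SketchMover are hypothetical names that only mirror the shape of the Checksum type and ChecksumMover interface described here, and the runIO signature is simplified to take a plain InputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Stand-in for the pool's Checksum type; hypothetical, reduced to what the sketch needs. */
class SketchChecksum {
    private final MessageDigest digest;
    private byte[] value;              // set once the digest has been finalised

    SketchChecksum(String algorithm) throws NoSuchAlgorithmException {
        this.digest = MessageDigest.getInstance(algorithm);
    }

    MessageDigest getMessageDigest() { return digest; }

    /** Finalises the digest on the first call and returns a copy of the checksum bytes. */
    byte[] getValue() {
        if (value == null) {
            value = digest.digest();
        }
        return value.clone();
    }
}

/** Same shape as the ChecksumMover interface, using the stand-in type. */
interface SketchChecksumMover {
    void setDigest(SketchChecksum transferChecksum);
    SketchChecksum getClientChecksum();
    SketchChecksum getTransferChecksum();
}

class SketchMover implements SketchChecksumMover {
    private SketchChecksum transferChecksum;   // stays null unless setDigest is called
    private boolean completed;

    @Override
    public void setDigest(SketchChecksum transferChecksum) {
        this.transferChecksum = transferChecksum;
    }

    /** Simplified runIO: consumes the incoming data, updating the digest if one was set. */
    void runIO(InputStream in) throws IOException {
        MessageDigest transferDigest =
                transferChecksum == null ? null : transferChecksum.getMessageDigest();
        byte[] buffer = new byte[4096];
        int rc;
        while ((rc = in.read(buffer, 0, buffer.length)) != -1) {
            // a real mover would also write the bytes to the pool's disk file here
            if (transferDigest != null) {
                transferDigest.update(buffer, 0, rc);
            }
        }
        completed = true;
    }

    /** This sketch never receives a checksum from the client, so it reports none. */
    @Override
    public SketchChecksum getClientChecksum() { return null; }

    /** Null when no digest was requested or the transfer did not finish. */
    @Override
    public SketchChecksum getTransferChecksum() {
        return completed ? transferChecksum : null;
    }
}
```

Note that the read loop tests for the end-of-stream value before calling update, since passing a negative length to MessageDigest.update is an error.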
The DCapProtocol_3_nio mover implements the ChecksumMover interface and is able to report the Client Checksum and the Transfer Checksum to the pool. To enable the DCapProtocol_3_nio mover to calculate the Transfer Checksum, either the cell context dCap3-calculate-transfer-crc or the cell batch line option calculate-transfer-crc must be set to true. The latter may also be set in the *.poollist file. DCapProtocol_3_nio disables checksum calculation as soon as the mover receives any client command other than write (e.g. read, seek or seek_and_write).
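The write-only rule can be sketched as follows. This is not the DCapProtocol_3_nio implementation: WriteOnlyChecksumTracker and its methods are hypothetical, commands are modelled as plain strings, and Adler-32 is used only as a convenient JDK algorithm. The presumed reason for the rule is that a running checksum only describes the file content if the data arrives as one sequential stream of writes.

```java
import java.util.zip.Adler32;

/** Hypothetical helper: keeps a running checksum valid only for pure write sessions. */
class WriteOnlyChecksumTracker {
    private final Adler32 sum = new Adler32();
    private boolean valid = true;

    /** Feeds one client command; any command other than "write" disables the checksum. */
    void onCommand(String command, byte[] payload) {
        if (!valid) {
            return;                    // once disabled, stays disabled
        }
        if ("write".equals(command)) {
            sum.update(payload, 0, payload.length);
        } else {
            valid = false;             // e.g. read, seek or seek_and_write
        }
    }

    /** Returns the transfer checksum, or null if calculation was disabled. */
    Long getTransferChecksum() {
        return valid ? Long.valueOf(sum.getValue()) : null;
    }
}
```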
The checksum module (as part of the pool) and its command subset (csm ...) determine the behaviour of the checksum calculation.
csm set policy -ontransfer=on
Movers implementing the ChecksumMover interface are requested to calculate the Transfer Checksum. Whether or not the mover actually performs the calculation might depend on additional, mover-specific flags, like the dCap3-calculate-transfer-crc flag for the DCapProtocol_3_nio mover.
If the mover reports the Transfer Checksum and there is a Client Checksum available, either from pnfs or from the mover protocol, the Transfer Checksum and the Client Checksum are compared. A mismatch will result in a CRC Exception.
If there is no Client Checksum available whatsoever, the Transfer Checksum is stored in pnfs.
csm set policy -onwrite=on
After the dataset has been completely and successfully written to disk, the pool calculates the checksum based on the disk file (Server File Checksum). The result is compared to either the Client Checksum or the Transfer Checksum and a CRC Exception is thrown in case of a mismatch.
If there is neither the Client Checksum nor the Transfer Checksum available, the Server File Checksum is stored in pnfs.
csm set policy -enforcecrc=on
In case of -onwrite=off, this option enforces the calculation of the Server File Checksum ONLY if neither the Client Checksum nor the Transfer Checksum has been successfully calculated. The result is stored in pnfs.
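The interplay of the three policy flags can be summarised as decision logic. The sketch below is an interpretation of the rules described in this section, not code from the pool: CsmPolicySketch, CrcMismatchException and apply are hypothetical names, checksums are modelled as strings with null meaning "not available", and the disk file checksum is passed as a supplier so that it is only calculated when a rule asks for it.

```java
import java.util.function.Supplier;

/** Illustrative decision logic only; none of these names exist in dCache. */
class CsmPolicySketch {

    /** Stand-in for the CRC Exception mentioned in the text. */
    static class CrcMismatchException extends RuntimeException {
        CrcMismatchException(String expected, String actual) {
            super("checksum mismatch: " + expected + " != " + actual);
        }
    }

    private static void compare(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new CrcMismatchException(expected, actual);
        }
    }

    /**
     * Returns the checksum that would be stored in pnfs, or null if none.
     * clientCrc: from pnfs or the mover protocol; transferCrc: as reported by the mover.
     */
    static String apply(boolean onTransfer, boolean onWrite, boolean enforceCrc,
                        String clientCrc, String transferCrc,
                        Supplier<String> diskFileCrc) {
        String toStore = null;

        // -ontransfer: the transfer checksum only counts if the policy requested it
        String transfer = onTransfer ? transferCrc : null;
        if (transfer != null) {
            if (clientCrc != null) {
                compare(clientCrc, transfer);
            } else {
                toStore = transfer;
            }
        }

        String reference = clientCrc != null ? clientCrc : transfer;
        if (onWrite) {
            // -onwrite: always calculate the disk file checksum and verify it
            String disk = diskFileCrc.get();
            if (reference != null) {
                compare(reference, disk);
            } else {
                toStore = disk;
            }
        } else if (enforceCrc && reference == null) {
            // -enforcecrc: calculate from disk only as a last resort
            toStore = diskFileCrc.get();
        }
        return toStore;
    }
}
```

For example, `CsmPolicySketch.apply(false, false, true, null, null, () -> "d")` yields the disk file checksum, because neither of the other two checksums is available.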