
Checksums in detail


Overview

When writing data into the dCache, and possibly later on into an HSM, checksums may be calculated at different points within this chain.

Client Checksum

The client calculates the checksum before or while the data is sent to the dCache. Depending on when it was calculated, the checksum value may be sent together with the open request to the door and stored in pnfs before the data transfer begins, or it may be sent with the close operation after the data has been transferred.

The dCap protocol provides both methods, but the dCap clients use the latter by default.

The FTP protocol does not provide a mechanism to send a checksum. Nevertheless, some FTP clients can (mis-)use the site command to send the checksum prior to the actual data transfer.
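As a rough illustration of the "while the data is sent" case, a client can fold the checksum calculation into its write path. The sketch below uses the JDK's CheckedOutputStream for this. The choice of Adler-32, the class name ClientChecksumSketch and the helper sendWithChecksum are assumptions made for the example; they are not taken from the dCap or FTP client code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.CheckedOutputStream;

class ClientChecksumSketch {

    // Hypothetical helper: copies 'in' to 'out' and returns the Adler-32
    // value of everything that was written, as an 8-digit hex string.
    static String sendWithChecksum(InputStream in, OutputStream out) throws IOException {
        CheckedOutputStream checked = new CheckedOutputStream(out, new Adler32());
        byte[] buffer = new byte[8192];
        int rc;
        while ((rc = in.read(buffer, 0, buffer.length)) != -1) {
            // The checksum is updated as a side effect of every write.
            checked.write(buffer, 0, rc);
        }
        checked.flush();
        return String.format("%08x", checked.getChecksum().getValue());
    }

    public static void main(String[] args) throws IOException {
        InputStream data = new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the network
        System.out.println(sendWithChecksum(data, wire)); // prints 024d0127
    }
}
```

Because the value is only complete once the last byte has been written, a client working this way can only deliver it with the close operation, not with the open request.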

Transfer Checksum

While data is coming in, the server data mover may calculate the checksum on the fly.

Server File Checksum

After all the file data has been received by the dCache server and the file has been fully written to disk, the server may calculate the checksum, based on the disk file.

The table below sketches the different schemes for dCap and FTP with and without client checksum calculation:

Table 20.1. Checksum calculation flow

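Step  FTP (w/o initial CRC)            FTP (with initial CRC)           dCap
1     Create Entry                     Create Entry                     Create Entry
2     -                                Store Client CRC in pnfs         -
3     Server calculates transfer CRC   Server calculates transfer CRC   Server calculates transfer CRC
4     -                                Get Client CRC from pnfs         Get Client CRC from mover
5     -                                Compare Client and Server CRC    Compare Client and Server CRC
6     Store transfer CRC in pnfs       -                                Store client CRC in pnfs
7     Server calculates disk file CRC  Server calculates disk file CRC  Server calculates disk file CRC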


ChecksumMover Interface

As far as the server data mover is concerned, only the Client Checksum and the Transfer Checksum are of interest. While the Client Checksum is simply delivered to the server mover as part of the protocol (e.g. with the close operation for dCap), the Transfer Checksum has to be calculated by the server mover on the fly. To communicate the different checksums to the embedding pool, the server mover has to implement the ChecksumMover interface in addition to the MoverProtocol interface. A mover that does not implement the ChecksumMover interface is assumed not to handle checksums at all. The Server File Checksum (the disk file checksum) is calculated independently of the mover, within the pool itself.

public interface ChecksumMover {

        public void     setDigest( Checksum transferChecksum ) ;
        public Checksum getClientChecksum() ;
        public Checksum getTransferChecksum() ;

}

The pool may call the setDigest method to advise the mover which checksum algorithm to use. If setDigest is not called, the mover is not expected to calculate the Transfer Checksum.

java.security.MessageDigest transferDigest = transferChecksum.getMessageDigest() ;

                ***

        while( ... ){

                rc = read( buffer , 0 , buffer.length ) ;

                ***

                transferDigest.update( buffer , 0 , rc ) ;
        }
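The fragment above elides the surrounding mover code. For readers who want something they can run, here is a minimal, self-contained sketch of the same pattern: every chunk that is read is also fed into a java.security.MessageDigest. The class TransferDigestSketch, the helpers receive and toHex, and the use of MD5 are assumptions chosen for the example; the text does not say which algorithm the pool requests.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class TransferDigestSketch {

    // Hypothetical stand-in for the mover's receive loop: drains 'in' and
    // updates the digest with each chunk as it arrives.
    static byte[] receive(InputStream in, MessageDigest transferDigest) throws IOException {
        byte[] buffer = new byte[4096];
        int rc;
        // read() returns -1 at end of stream; that value must never reach update().
        while ((rc = in.read(buffer, 0, buffer.length)) != -1) {
            // A real mover would write buffer[0..rc) to the disk file here.
            transferDigest.update(buffer, 0, rc);
        }
        return transferDigest.digest();
    }

    // Renders a digest as lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        // MD5 is used only because every JDK provides it.
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        InputStream data = new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(toHex(receive(data, md5))); // prints 900150983cd24fb0d6963f7d28e17f72
    }
}
```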

getClientChecksum and getTransferChecksum are called by the pool after the MoverProtocol's runIO method has completed successfully. These methods should return null if the corresponding checksum could not be determined for whatever reason.

public void setDigest( Checksum transferChecksum ){
        this.transferChecksum = transferChecksum ;
}
public Checksum getClientChecksum(){
        // null tells the pool that no client checksum was delivered
        return clientChecksumString == null ?
                null :
                new Checksum( clientChecksumString ) ;
}
public Checksum getTransferChecksum(){ return transferChecksum ; }
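To show how these three methods fit together in one class, the following sketch wraps them in a compilable mover skeleton. The Checksum class here is a deliberately tiny stand-in (the real class is not shown in this section), and SketchMover and clientChecksumReceived are hypothetical names introduced only for the example.

```java
// Minimal stand-in for the Checksum type; the real class is richer.
class Checksum {
    private final String value;
    Checksum(String value) { this.value = value; }
    String getValue() { return value; }
}

// The interface as given in the text.
interface ChecksumMover {
    void setDigest(Checksum transferChecksum);
    Checksum getClientChecksum();
    Checksum getTransferChecksum();
}

// Hypothetical mover skeleton; the actual data transfer is left out.
class SketchMover implements ChecksumMover {
    private Checksum transferChecksum;    // stays null unless the pool calls setDigest
    private String clientChecksumString;  // stays null unless the client sends a value

    @Override
    public void setDigest(Checksum transferChecksum) {
        this.transferChecksum = transferChecksum;
    }

    // Hypothetical hook, e.g. invoked when the close operation carries a checksum.
    void clientChecksumReceived(String checksum) {
        this.clientChecksumString = checksum;
    }

    @Override
    public Checksum getClientChecksum() {
        return clientChecksumString == null ? null : new Checksum(clientChecksumString);
    }

    @Override
    public Checksum getTransferChecksum() {
        return transferChecksum;
    }

    public static void main(String[] args) {
        SketchMover mover = new SketchMover();
        // Nothing has been delivered yet, so the pool would see null.
        System.out.println(mover.getClientChecksum() == null); // prints true
        mover.clientChecksumReceived("024d0127");
        System.out.println(mover.getClientChecksum().getValue()); // prints 024d0127
    }
}
```

Both getters return null until the corresponding value is known, which is the contract the pool relies on after runIO.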


The DCapProtocol_3_nio Mover

The DCapProtocol_3_nio mover implements the ChecksumMover interface and is able to report the Client Checksum and the Transfer Checksum to the pool. To enable the DCapProtocol_3_nio mover to calculate the Transfer Checksum, either the cell context dCap3-calculate-transfer-crc or the cell batch line option calculate-transfer-crc must be set to true. The latter may also be set in the *.poollist file. DCapProtocol_3_nio disables checksum calculation as soon as the mover receives any client command other than 'write' (e.g. read, seek or seek_and_write).
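The "disable on anything but write" rule can be pictured as a one-way switch. The sketch below is an illustration of that rule only, not the DCapProtocol_3_nio code; the class WriteOnlyChecksumGuard and its methods are hypothetical. The rule presumably exists because a running digest only matches the file content when the bytes arrive strictly in sequence, which is an interpretation and not something the text states.

```java
// Hypothetical illustration of the rule described above.
class WriteOnlyChecksumGuard {
    private boolean checksumEnabled;

    // 'calculateTransferCrc' mirrors the calculate-transfer-crc setting.
    WriteOnlyChecksumGuard(boolean calculateTransferCrc) {
        this.checksumEnabled = calculateTransferCrc;
    }

    // Called once per client command. Any command other than "write"
    // switches the calculation off, and it stays off for this transfer.
    void onClientCommand(String command) {
        if (!"write".equals(command)) {
            checksumEnabled = false;
        }
    }

    boolean isChecksumEnabled() {
        return checksumEnabled;
    }

    public static void main(String[] args) {
        WriteOnlyChecksumGuard guard = new WriteOnlyChecksumGuard(true);
        guard.onClientCommand("write");
        System.out.println(guard.isChecksumEnabled()); // prints true
        guard.onClientCommand("seek_and_write");
        guard.onClientCommand("write");
        System.out.println(guard.isChecksumEnabled()); // prints false
    }
}
```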


The ChecksumModule

The checksum module (part of the pool) and its command subset (csm ...) determine the behaviour of the checksum calculation.

  • csm set policy -ontransfer=on

    Movers implementing the ChecksumMover interface are requested to calculate the Transfer Checksum. Whether or not the mover actually performs the calculation may depend on additional, mover-specific flags, such as the dCap3-calculate-transfer-crc flag for the DCapProtocol_3_nio mover.

    If the mover reports the Transfer Checksum and there is a Client Checksum available, either from pnfs or from the mover protocol, the Transfer Checksum and the Client Checksum are compared. A mismatch results in a CRC Exception.

    If there is no Client Checksum available whatsoever, the Transfer Checksum is stored in pnfs.

  • csm set policy -onwrite=on

    After the dataset has been completely and successfully written to disk, the pool calculates the checksum based on the disk file (Server File Checksum). The result is compared to either the Client Checksum or the Transfer Checksum and a CRC Exception is thrown in case of a mismatch.

    If there is neither the Client Checksum nor the Transfer Checksum available, the Server File Checksum is stored in pnfs.

  • csm set policy -enforcecrc=on

    If -onwrite=off, this option enforces the calculation of the Server File Checksum, but ONLY if neither the Client Checksum nor the Transfer Checksum has been successfully calculated. The result is stored in pnfs.
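The interplay of the three policy flags described above can be summarised as a small decision function. This is a simplified model and not the pool's implementation: checksums are plain strings, ChecksumPolicySketch, apply and CrcException are hypothetical names, and the sketch assumes that the Client Checksum takes precedence over the Transfer Checksum when the Server File Checksum is compared (the text only says "either").

```java
import java.util.function.Supplier;

class ChecksumPolicySketch {

    // Hypothetical stand-in for the "CRC Exception" mentioned in the text.
    static class CrcException extends RuntimeException {
        CrcException(String message) { super(message); }
    }

    // Returns the checksum that would end up in pnfs, or null if none is known.
    // 'fileChecksum' is only invoked when the disk file really has to be read.
    static String apply(boolean onTransfer, boolean onWrite, boolean enforceCrc,
                        String client, String moverTransfer,
                        Supplier<String> fileChecksum) {
        // With -ontransfer=off the mover is not asked for a Transfer Checksum.
        String transfer = onTransfer ? moverTransfer : null;

        if (transfer != null && client != null && !transfer.equals(client)) {
            throw new CrcException("Transfer Checksum differs from Client Checksum");
        }
        // Assumption: prefer the Client Checksum as the reference value.
        String known = client != null ? client : transfer;

        if (onWrite) {
            String file = fileChecksum.get();
            if (known != null && !file.equals(known)) {
                throw new CrcException("Server File Checksum differs from " + known);
            }
            return known != null ? known : file;
        }
        if (enforceCrc && known == null) {
            // Last resort: nothing else is available, so read the disk file.
            return fileChecksum.get();
        }
        return known;
    }

    public static void main(String[] args) {
        // No client or transfer checksum, -onwrite=off, -enforcecrc=on.
        System.out.println(apply(false, false, true, null, null, () -> "024d0127")); // prints 024d0127
    }
}
```

Passing the file checksum as a Supplier keeps the expensive disk read lazy, which mirrors the point of -enforcecrc: the file is only re-read when no cheaper checksum exists.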