From the allowable pools as determined by the pool selection unit, the pool manager determines the pool used for storing or reading a file by calculating a cost value for each pool. The pool with the lowest cost is used.
If a client requests to read a file which is stored on more than one allowable pool, the performance costs are calculated for these pools. In short, this cost value describes how much the pool is currently occupied with transfers.
If a pool has to be selected for storing a file, which is either written by a client or restored from a tape backend, this performance cost is combined with a space cost value into a total cost value for the decision. The space cost describes how much it “hurts” to free space on the pool for the file.
The cost module is responsible for calculating the cost values for all pools. The pools regularly send all necessary information about space usage and request queue lengths to the cost module, which can therefore be regarded as a cache for this information. This makes it unnecessary to send “get cost” requests to the pools for each client request. Between updates, the cost module interpolates the expected costs until the next precise information package arrives from the pools. This mechanism prevents clumping of requests.
Calculating the cost for a data transfer is done in two steps. First, the cost module merges all information about space and transfer queues of the pools to calculate the performance and space costs separately. Second, in the case of a write or stage request, these two numbers are merged to build the total cost for each pool. The first step is isolated within a separate loadable class. The second step is done by the cost module.
The load of a pool is determined by comparing the current number of active and waiting transfers to the maximum number of concurrent transfers allowed. This is done separately for each of the transfer types (store, restore, pool-to-pool client, pool-to-pool server, and client request) with the following equation:
perfCost(per Type) = ( activeTransfers + waitingTransfers ) / maxAllowed .
The maximum number of concurrent transfers (maxAllowed) can be configured with the commands st set max active (store), rh set max active (restore), mover set max active (client request), mover set max active -queue=p2p (pool-to-pool server), and pp set max active (pool-to-pool client).
Then the average is taken over all mover types for which maxAllowed is not zero. For example, for a pool where store, restore and client transfers are allowed,
perfCost(total) = ( perfCost(store) + perfCost(restore) + perfCost(client) ) / 3 ,
and for a read only pool:
perfCost(total) = ( perfCost(restore) + perfCost(client) ) / 2 .
For a well balanced system, the performance cost should not exceed 1.0.
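The averaging rule above can be sketched in Python as follows. This is only an illustration and not dCache's implementation: the QueueInfo container, the function names and the queue numbers are all hypothetical, and the handling of a pool with no enabled transfer type is an assumption, since the text does not cover that case.

```python
from typing import Dict, NamedTuple


class QueueInfo(NamedTuple):
    """Queue state for one transfer type (hypothetical container)."""
    active: int
    waiting: int
    max_allowed: int


def perf_cost_per_type(q: QueueInfo) -> float:
    # perfCost(per Type) = (activeTransfers + waitingTransfers) / maxAllowed
    return (q.active + q.waiting) / q.max_allowed


def perf_cost_total(queues: Dict[str, QueueInfo]) -> float:
    # Average only over the transfer types that are enabled (maxAllowed != 0).
    costs = [perf_cost_per_type(q) for q in queues.values() if q.max_allowed != 0]
    # Assumption: a pool with no enabled transfer type gets cost 0.0 here.
    return sum(costs) / len(costs) if costs else 0.0


# Made-up read-only pool: store is disabled (maxAllowed = 0).
read_only_pool = {
    "store": QueueInfo(active=0, waiting=0, max_allowed=0),
    "restore": QueueInfo(active=1, waiting=1, max_allowed=4),    # 2 / 4 = 0.5
    "client": QueueInfo(active=10, waiting=5, max_allowed=10),   # 15 / 10 = 1.5
}
print(perf_cost_total(read_only_pool))  # (0.5 + 1.5) / 2 = 1.0
```

In this example the client queue alone is overloaded (1.5), but the averaged cost sits exactly at the 1.0 limit mentioned above.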
In this section only the new scheme for calculating the space cost is described. Be aware that the old scheme is used if the breakeven parameter of a pool is greater than or equal to 1.0.
The cost value used for determining a pool for storing a file depends either on the free space on the pool or on the age of the least recently used (LRU) file, which would have to be deleted.
The space cost is calculated as follows:
If | freeSpace > gapPara | then | spaceCost = 3 * newFileSize / freeSpace |
If | freeSpace <= gapPara | and | lruAge < 60 | then | spaceCost = 1 + costForMinute |
If | freeSpace <= gapPara | and | lruAge >= 60 | then | spaceCost = 1 + costForMinute * 60 / lruAge |
where the variable names have the following meanings:
- freeSpace
The free space left on the pool
- newFileSize
The size of the file to be written to one of the pools; values below 50MB are treated as 50MB.
- lruAge
The age of the least recently used file on the pool, in seconds (the threshold of 60 in the formulas above corresponds to one minute).
- gapPara
The gap parameter. Default is 4GB. The size of free space below which it will be assumed that the pool is full and consequently the least recently used file has to be removed. If, on the other hand, the free space is greater than gapPara, it will be expensive to store a file on the pool which exceeds the free space. It can be set per pool with the set gap command. This has to be done in the pool cell and not in the pool manager cell. Nevertheless, it only influences the cost calculation scheme within the pool manager and not the behaviour of the pool itself.
- costForMinute
A parameter which fixes the space cost of a one-minute-old LRU file to (1 + costForMinute). It can be set with the set breakeven command, where
costForMinute = breakeven * 7 * 24 * 60.
I.e. the space cost of a one-week-old LRU file will be (1 + breakeven). Note again that all this only applies if breakeven < 1.0.
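The three cases and the parameter definitions above can be put together in a short Python sketch. It is a sketch under stated assumptions and not dCache code: the function name space_cost is hypothetical, sizes are taken in bytes with binary units for "GB" and "MB", lruAge is taken in seconds (inferred from the threshold of 60 matching one minute), and raising an error for breakeven >= 1.0 is merely a placeholder for the old scheme, which is not described here.

```python
GIB = 1024 ** 3  # assumption: "GB" in the text is read as a binary unit
MIB = 1024 ** 2  # assumption: "MB" in the text is read as a binary unit


def space_cost(free_space: int, new_file_size: int, lru_age: float,
               breakeven: float, gap_para: int = 4 * GIB) -> float:
    """Space cost under the new scheme (sizes in bytes, lru_age in seconds)."""
    if breakeven >= 1.0:
        # The old scheme applies in this case; it is not sketched here.
        raise ValueError("breakeven >= 1.0 selects the old scheme")
    # newFileSize is at least 50MB.
    new_file_size = max(new_file_size, 50 * MIB)
    cost_for_minute = breakeven * 7 * 24 * 60
    if free_space > gap_para:
        # Enough free space: cost grows with the share of it the file needs.
        return 3 * new_file_size / free_space
    if lru_age < 60:
        # LRU file younger than one minute: cost is capped at this value.
        return 1 + cost_for_minute
    # Pool counts as full: the older the LRU file, the cheaper its removal.
    return 1 + cost_for_minute * 60 / lru_age


one_week = 7 * 24 * 3600
# Full pool with a one-week-old LRU file: close to 1 + breakeven = 1.7
print(space_cost(free_space=1 * GIB, new_file_size=200 * MIB,
                 lru_age=one_week, breakeven=0.7))
# Pool with 10 GiB free and a 1 GiB file: 3 * 1 / 10
print(space_cost(free_space=10 * GIB, new_file_size=1 * GIB,
                 lru_age=one_week, breakeven=0.7))
```

Note that the second and third branches agree at lruAge = 60, so the cost is continuous at the one-minute boundary.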
The prescription above can be stated a little differently as follows:
If | freeSpace > gapPara | then | spaceCost = 3 * newFileSize / freeSpace |
If | freeSpace <= gapPara | then | spaceCost = 1 + breakeven * 7 * 24 * 60 * 60 / lruAge , |
where newFileSize is at least 50MB and lruAge is at least one minute.
As the last version of the formula suggests, a pool can be in two states: either freeSpace > gapPara or freeSpace <= gapPara. In other words, either there is free space left to store files without deleting cached files, or there is not.
Therefore, gapPara should be around the size of the smallest files that are frequently written to the pool. If files smaller than gapPara appear very rarely or never, the pool might get stuck in the first of the two cases with a high cost.
If the LRU file is smaller than the new file, other files might have to be deleted. If these are much younger than the LRU file, this space cost calculation scheme might not lead to a selection of the optimal pool. However, in practice this happens very rarely and this scheme turns out to be very efficient.
The total cost is a linear combination of the performance and space cost, i.e.
totalCost = ccf * perfCost + scf * spaceCost ,
where ccf and scf are configurable with the command set pool decision.
E.g.,
(PoolManager) admin > set pool decision -spacecostfactor=3 -cpucostfactor=1
will give the space cost three times the weight of the performance cost.
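To illustrate how the two factors shift the decision, the following Python sketch combines the costs and picks the pool with the lowest total. The function names, pool names and cost values are hypothetical and chosen only for illustration; the sketch is not taken from the pool manager.

```python
from typing import Dict, Tuple


def total_cost(perf_cost: float, space_cost: float,
               ccf: float, scf: float) -> float:
    # totalCost = ccf * perfCost + scf * spaceCost
    return ccf * perf_cost + scf * space_cost


def cheapest_pool(costs: Dict[str, Tuple[float, float]],
                  ccf: float, scf: float) -> str:
    """Return the name of the pool with the lowest total cost.

    costs maps a pool name to its (perfCost, spaceCost) pair.
    """
    return min(costs, key=lambda name: total_cost(*costs[name], ccf=ccf, scf=scf))


# Made-up values: pool_a is idle but full, pool_b is busy but has free space.
pools = {
    "pool_a": (0.2, 1.5),
    "pool_b": (1.0, 0.3),
}

# Space weighted three times as much as performance:
# pool_a = 0.2 + 4.5, pool_b = 1.0 + 0.9, so pool_b is chosen.
print(cheapest_pool(pools, ccf=1, scf=3))

# Performance weighted three times as much as space:
# pool_a = 0.6 + 1.5, pool_b = 3.0 + 0.3, so pool_a is chosen.
print(cheapest_pool(pools, ccf=3, scf=1))
```

With the same pool state, swapping the two factors flips the selection, which is the effect the set pool decision command is meant to control.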