Most of the computation of genetic relationship matrices and their inverses for the current (*t* + 1) evaluation of a population can be avoided by updating the matrices from the previous (*t*) evaluation for the new animals only. This study describes and develops existing methods for updating pedigree-based and marker-based relationship matrices and their inverses. Updating some of the matrices could benefit from parallel computing.

## Introduction

Best linear unbiased prediction (BLUP; Henderson 1973), genomic BLUP (GBLUP; VanRaden 2008), and single-step GBLUP (ssGBLUP; Aguilar et al. 2010; Christensen and Lund 2010) are the most common methods used for genetic and (or) genomic evaluation of livestock. Relationships among individuals are modelled via pedigree or genomic relationship matrices (**A** or **G**, respectively). The inverses of these matrices are required in BLUP and GBLUP, respectively. In ssGBLUP, both inverses are required, together with the inverse of the pedigree relationship matrix for genotyped animals. A major computational burden for genetic evaluation systems is matrix inversion. With conventional inversion algorithms, the cost of inverting **G** is cubic in the number of genotyped animals (Meyer et al. 2013). However, the method for indirect inversion of **A** (Henderson 1975; Quaas 1976) has made its inversion possible at a linear cost. There is an opportunity to reduce the computational cost of the time *t* + 1 evaluation by updating the relationship matrices and their inverses from time *t* with the information on the new animals. In this methodology note, methods of updating relationship matrices and their inverses are presented.

## Methods

### Updating G

The most common way of forming the genomic relationship matrix is **G** = *c***ZZ**′ (VanRaden 2008), where *c* = 1/[2Σ*p*_{i}(1 - *p*_{i})] is a scaling factor making **G** analogous to **A**, *p*_{i} is the allele frequency at locus *i*, **Z** = **M** - 2**P**, **M** is the genotype matrix with genotypes at each locus coded as {0, 1 (heterozygote), 2}, **P** = **1***p*′, *p* is the column vector of allele frequencies (from the previous evaluation), and **1** is a column vector of ones with order equal to the number of animals. Denoting old and young animals with subscripts 1 and 2, the genomic relationship matrix can be updated for new genotypes:

$$\mathbf{G} = c\begin{bmatrix}\mathbf{Z}_{1}\mathbf{Z}_{1}' & \mathbf{Z}_{1}\mathbf{Z}_{2}' \\ \mathbf{Z}_{2}\mathbf{Z}_{1}' & \mathbf{Z}_{2}\mathbf{Z}_{2}'\end{bmatrix} = \begin{bmatrix}\mathbf{G}_{11} & \mathbf{G}_{12} \\ \mathbf{G}_{21} & \mathbf{G}_{22}\end{bmatrix} \tag{1}$$

Matrices **G**_{12} and **G**_{22} are required for updating **G**:

$$\mathbf{G}_{12} = c\mathbf{Z}_{1}\mathbf{Z}_{2}', \quad \mathbf{G}_{22} = c\mathbf{Z}_{2}\mathbf{Z}_{2}', \quad \mathbf{Z}_{2} = \mathbf{M}_{2} - 2\mathbf{P}_{2} \tag{2}$$

where **M**_{2} and **P**_{2} are the rows of **M** and **P** for young animals. In comparison with **ZZ**′, **ZZ**_{2}′ has computational complexity (*O*) of (*n*_{1} + *n*_{2})*mn*_{2} rather than (*n*_{1} + *n*_{2})^{2}*m*, where *n*_{1}, *n*_{2}, and *m* are the numbers of old genotyped animals, young genotyped animals, and markers, respectively. One unit of *O* is defined as a single arithmetic operation. Figure 1 shows (*a*) *O*(**ZZ**′) for *m* = 1 and (*b*) *O*(**ZZ**_{2}′)/*O*(**ZZ**′) = *n*_{2}/(*n*_{1} + *n*_{2}) for various *n*_{1} and *n*_{2}. For *n*_{2} < *n*_{1}, *O*(**ZZ**′) changed linearly with *n*_{2} and quadratically with increasing *n*_{1} (Fig. 1*a*). The benefit from updating **G** increased by increasing *n*_{1}/*n*_{2} (Fig. 1*b*).
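As a numerical check of the update above, the following NumPy sketch (simulated genotypes; all names are illustrative) computes only the **G**_{12} and **G**_{22} blocks for the young animals and appends them to a stored **G**_{11}:

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, m = 5, 2, 50                       # old animals, young animals, markers
M = rng.integers(0, 3, size=(n1 + n2, m))  # genotypes coded {0, 1, 2}
p = M[:n1].mean(axis=0) / 2                # allele frequencies from the previous evaluation
c = 1.0 / (2.0 * np.sum(p * (1.0 - p)))
Z = M - 2.0 * p                            # Z = M - 2P, with P = 1p'

Z1, Z2 = Z[:n1], Z[n1:]
G11 = c * Z1 @ Z1.T                        # stored from the previous evaluation

# Update: only (n1 + n2) * m * n2 operations for the new blocks
G12 = c * Z1 @ Z2.T
G22 = c * Z2 @ Z2.T
G_updated = np.block([[G11, G12], [G12.T, G22]])
```

Here `np.allclose(G_updated, c * Z @ Z.T)` holds, i.e., the update reproduces the full rebuild at a fraction of the cost.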

Considering **P**′ = [**P**_{1}′ **P**_{2}′], if the old and new genotypes are from discrete populations or breeds, the allele frequencies in **P**_{1} (rows of **P** for old genotypes) and in **P**_{2} are different (i.e., *p* and *p*_{2}). In that case, eq. 2 is formed with the allele frequencies of the new genotypes:

$$\mathbf{Z}_{2} = \mathbf{M}_{2} - 2\mathbf{P}_{2}, \quad \mathbf{P}_{2} = \mathbf{1}\mathbf{p}_{2}' \tag{3}$$

Even for homogeneous populations, *p* and *p*_{2} can be different. If *n*_{2} is relatively large and the deviations between *p* and *p*_{2} are considerable, *p* should be updated to *p** = (*n*_{1}*p* + *n*_{2}*p*_{2})/(*n*_{1} + *n*_{2}), and *c* should be updated to *c** (using *p**) for both groups of animals. That updates eq. 2 and results in eq. 4:

$$\mathbf{G}_{12} = c^{*}(\mathbf{M}_{1} - 2\mathbf{P}_{1}^{*})(\mathbf{M}_{2} - 2\mathbf{P}_{2}^{*})', \quad \mathbf{G}_{22} = c^{*}(\mathbf{M}_{2} - 2\mathbf{P}_{2}^{*})(\mathbf{M}_{2} - 2\mathbf{P}_{2}^{*})' \tag{4}$$

where **P**_{1}* and **P**_{2}* equal **1***p**′, with **1** of the order of *n*_{1} and *n*_{2}, respectively.

Furthermore, allele frequencies in **G**_{11} should be updated, which is done by regressing **G**_{11} as if *p** was applied, i.e., **G̃**_{11} = *b***G**_{11} + *α*, where *b* = *c**/*c* and *α* = 4*c**[*p**′*p** - *p*′*p* - (*p** - *p*)′**M**_{1}′**1**/*n*_{1}]. The proof of *α* is provided in Appendix A. Like any regression, error terms are involved. Therefore, **G̃**_{11} is an approximation of **G**_{11} as if *p** was used instead of *p*. Vitezica et al. (2011) used a similar approach in the context of ssGBLUP, regressing **G** to **A**_{gg} (the block of **A** corresponding to genotyped animals) so that **G** is regressed to the same base population as in **A**_{gg}. Given *p* and *p**, *O*(*α*) = (*n*_{1} + 4)(*m* + 1) - 2, in which *c**, *p**′*p**, and *p*′*p* each have computational complexity *m*, and *O*[(*p** - *p*)′**M**_{1}′**1**] = *m* + *mn*_{1}. Figure 2 shows the impact of different *m* and *n*_{1} on *O*(*α*).
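The regression of **G**_{11} can be verified numerically. A minimal NumPy sketch (illustrative names; simulated allele frequencies) computes *p**, *c**, *b*, and *α*, and checks that the mean of the regressed matrix matches the mean of **G**_{11} rebuilt with *p**, which is the property from which *α* is derived in Appendix A:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, m = 6, 4, 200
M1 = rng.integers(0, 3, size=(n1, m))       # old genotypes
M2 = rng.integers(0, 3, size=(n2, m))       # new genotypes
p  = M1.mean(axis=0) / 2                    # old allele frequencies
p2 = M2.mean(axis=0) / 2                    # allele frequencies of the new animals

ps = (n1 * p + n2 * p2) / (n1 + n2)         # p*, weighted update
c  = 1.0 / (2.0 * np.sum(p  * (1.0 - p)))
cs = 1.0 / (2.0 * np.sum(ps * (1.0 - ps)))  # c*

# Regression coefficients: b = c*/c and
# alpha = 4c*[p*'p* - p'p - (p* - p)'M1'1/n1]
b = cs / c
alpha = 4.0 * cs * (ps @ ps - p @ p - (ps - p) @ (M1.T @ np.ones(n1)) / n1)

G11 = c * (M1 - 2.0 * p) @ (M1 - 2.0 * p).T          # built with old p
G11_tilde = b * G11 + alpha                          # regressed approximation
G11_star = cs * (M1 - 2.0 * ps) @ (M1 - 2.0 * ps).T  # exact rebuild with p*
```

The means agree exactly, while individual elements differ by the regression error.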

### Updating G^{-1}

Hager (1989) introduced the state-of-the-art method of updating the inverse of a matrix. Meyer et al. (2013) showed how this method can be used to update **G**^{-1}, which is presented below.

$$\mathbf{G}^{-1} = \begin{bmatrix}\mathbf{G}_{11}^{-1} + \mathbf{Q}'\mathbf{G}^{22}\mathbf{Q} & -\mathbf{Q}'\mathbf{G}^{22} \\ -\mathbf{G}^{22}\mathbf{Q} & \mathbf{G}^{22}\end{bmatrix} \tag{5}$$

where **G**^{22} = (**G**_{22} - **QG**_{12})^{-1} and **Q** = **G**_{21}**G**_{11}^{-1}. There are other forms of presenting the updated **G**^{-1}, such as

$$\mathbf{G}^{-1} = \begin{bmatrix}\mathbf{G}_{11}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0}\end{bmatrix} + \begin{bmatrix}-\mathbf{Q}' \\ \mathbf{I}\end{bmatrix}\mathbf{G}^{22}\begin{bmatrix}-\mathbf{Q} & \mathbf{I}\end{bmatrix}$$

and

$$\mathbf{G}^{-1} = \begin{bmatrix}\mathbf{G}_{11}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0}\end{bmatrix} + \begin{bmatrix}\mathbf{Q}'\mathbf{G}^{22}\mathbf{Q} & -\mathbf{Q}'\mathbf{G}^{22} \\ -\mathbf{G}^{22}\mathbf{Q} & \mathbf{G}^{22}\end{bmatrix}$$

where **I** is an identity matrix with the order of the number of new genotyped animals (*n*_{2}). The only matrix in need of inversion is **G**_{22} - **QG**_{12}, which is a small matrix because the number of additional genotypes is usually small. This method considers no changes in allele frequencies due to the new genotypes. Assuming the allele frequencies of the new genotyped animals change the scale, but not the base, of **G**, this method can be modified by obtaining **G**_{12} and **G**_{22} according to eq. 4 and multiplying by *c*/*c**.
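The inverse update can be checked with a small NumPy sketch (simulated data; a ridge term keeps **G** safely invertible). Only the *n*_{2} × *n*_{2} matrix **G**_{22} - **QG**_{12} is inverted:

```python
import numpy as np

rng = np.random.default_rng(3)
n1, n2, m = 6, 2, 100
Z = rng.normal(size=(n1 + n2, m))
G = Z @ Z.T / m + 0.01 * np.eye(n1 + n2)   # a well-conditioned "genomic" matrix

G11, G12 = G[:n1, :n1], G[:n1, n1:]
G21, G22 = G[n1:, :n1], G[n1:, n1:]
G11_inv = np.linalg.inv(G11)               # stored from the previous evaluation

# Update: only the small n2 x n2 matrix (G22 - Q G12) is inverted
Q = G21 @ G11_inv
G22_sup = np.linalg.inv(G22 - Q @ G12)     # G^{22}
G_inv_upd = np.block([[G11_inv + Q.T @ G22_sup @ Q, -Q.T @ G22_sup],
                      [-G22_sup @ Q, G22_sup]])
```

The result matches `np.linalg.inv(G)` to numerical precision.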

The equation for updating **G**^{-1} (eq. 5) is similar to the equation for the inverse of the pedigree relationship matrix including phantom parent groups, introduced by Quaas (1988). By substituting **G**^{22} with **A**^{-1}, **G**_{11}^{-1} with **Φ**_{22}, and redefining **Q** as the matrix of relationship coefficients between the base animals and their phantom parent groups, eq. 5 changes to

$$\begin{bmatrix}\mathbf{\Phi}_{22} + \mathbf{Q}'\mathbf{A}^{-1}\mathbf{Q} & -\mathbf{Q}'\mathbf{A}^{-1} \\ -\mathbf{A}^{-1}\mathbf{Q} & \mathbf{A}^{-1}\end{bmatrix} \tag{6}$$

Considering **Φ**_{22} = **0**, this matrix becomes the same as the inverse of the pedigree relationship matrix including phantom parent groups (Quaas 1988). This assumption holds because there is no previous inverse for the new animals (phantom parent groups) to be added.

Compared with inverting the whole **G**, updating **G**^{-1} reduces the matrix inversion complexity from (*n*_{1} + *n*_{2})^{3} to *n*_{2}^{3}. However, there are additional complexities for matrix multiplication, equal to *n*_{1}^{2}*n*_{2} for **Q**, *n*_{1}*n*_{2}^{2} for **G**^{22}**Q**, *n*_{1}^{2}*n*_{2} for **Q**′**G**^{22}**Q**, and for matrix summations equal to *n*_{1} + *n*_{2}. The greater the *n*_{1}, and the smaller the *n*_{2}, the greater the advantage of updating **G**^{-1} (Meyer et al. 2013). Figure 3 shows (*a*) *O*(**G**^{-1}) and (*b*) *O*(updating to **G**^{-1})/*O*(**G**^{-1}) = [*n*_{2}^{3} + *n*_{1}*n*_{2}(2*n*_{1} + *n*_{2}) + *n*_{1} + *n*_{2}]/(*n*_{1} + *n*_{2})^{3} with various *n*_{1} and *n*_{2}. Generally, the computational complexity of **G**^{-1} increases cubically by increasing *n*_{1} and *n*_{2}. However, with *n*_{2} smaller than *n*_{1}, the trend of *O*(**G**^{-1}) by *n*_{2} gets closer to linear (Fig. 3*a*). Larger *n*_{1} also caused *O*(**G**^{-1}) to increase at a higher rate by increasing *n*_{2}. The benefit from updating **G**^{-1} increased by increasing *n*_{1}/*n*_{2} (Fig. 3*b*). The strategy of updating **G**^{-1} can also be applied to updating **A**_{22}^{-1}, the inverse of the pedigree relationship matrix for genotyped animals used in ssGBLUP (Aguilar et al. 2010; Christensen and Lund 2010). However, it would be more convenient to update **A**_{22}^{-1} from the blocks of **A**^{-1}, where **A**^{gg} is the block of **A**^{-1} for genotyped animals. In ssGBLUP, **G**^{-1} - **A**_{22}^{-1} is added to the block of **A**^{-1} corresponding to genotyped animals. According to Strandén and Mäntysaari (2014):

$$\mathbf{A}_{22}^{-1} = \mathbf{A}^{gg} - \mathbf{A}^{gn}(\mathbf{A}^{nn})^{-1}\mathbf{A}^{ng} \tag{7}$$

^{gg}where *n* denotes non-genotyped animals. The **A*** ^{ng}* and

**A**

*blocks are easy to obtain. Instead of directly inverting*

^{nn}**A**

*, each column of (*

^{nn}**A**

*)*

^{nn}^{-1}

**A**

*can be obtained by solving the equation system*

^{ng}**A**

*=*

^{nn}**x****, where**

*y***and**

*x***are the**

*y**j*th columns of (

**A**

*)*

^{nn}^{-1}

**A**

*and*

^{ng}**A**

*, respectively. Therefore, to get an updated , the new genotyped animals are appended to*

^{ng}**A**

*in the above formula. However, this is assuming that the pedigree of the new genotyped animals does not contain any new non-genotyped animal.*

^{ng}A better approach for updating **G**^{-1} is using the algorithm for proven and young (APY), developed by Misztal et al. (2014). This algorithm is better in the sense that it is computationally more feasible, it has less noise associated with noncoding markers (Nilforooshan and Lee 2019), and it may overcome the singularity problem of **G**, especially when the number of genotyped animals reach the number of markers. The approximate of **G**^{-1} is calculated as (Misztal et al. 2014)

where **D**_{22} is a diagonal matrix with diagonal elements *d*_{ii} = *g*_{ii} - **g**_{i1}**G**_{11}^{-1}**g**_{i1}′, *g*_{ii} is the *i*th diagonal element of **G**_{22}, and **g**_{i1} is the *i*th row of **G**_{21}. The *g*_{ii} elements can be obtained directly from the genotypes as *g*_{ii} = *c*Σ_{j}(*m*_{ij} - 2*p*_{j})^{2}, where *m*_{ij} is the genotype of the new individual *i* for marker *j*, and the sum is over the markers. Thus, given **G**_{11}^{-1} from the previous evaluation, only **G**_{12} and the *g*_{ii} elements are required, where **G**_{12} = *c*(**M**_{1} - 2**P**_{1})(**M**_{2} - 2**P**_{2})′. Please note that forming and inverting the whole **G**_{22} are not required.

In APY, genotyped animals are divided into core and noncore animals, and direct inversion is only required for the block of **G** for core animals (Misztal et al. 2014). This method can overcome the computational challenges of inverting a large **G**. In addition, as the number of genotyped animals approaches the number of markers, the numerical stability of **G** decreases; once the number of genotyped animals exceeds the number of markers, **G** is singular. By updating **G**^{-1} using APY, previously genotyped animals are considered as core and the new genotyped animals are considered as noncore. If the genotypes of the core animals provide enough information about the independent chromosome segments in the population, the off-diagonals of **G**^{22} do not contribute to the accuracy and can be ignored. Thus, the genomic breeding value of noncore animals is conditioned on the genomic breeding value of core animals (Misztal 2016). Random cross-generation core definition has been found to perform well in APY (Ostersen et al. 2016; Bradford et al. 2017; Nilforooshan and Lee 2019). Updating **G**^{-1} consecutively with APY results in dropping the latest generations from the core sample, which may have unfavourable effects. Therefore, after one or a few (depending on the population) APY updates of **G**^{-1}, it is recommended to replace the updated matrix with a newly built **G**^{-1}, or with an APY **G**^{-1} based on a newly sampled random core.
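The APY update can be sketched as follows (NumPy; illustrative data). For the check at the end, **G** is constructed so that the noncore animals are conditionally independent given the core, in which case the APY inverse is exact:

```python
import numpy as np

rng = np.random.default_rng(4)
n1, n2 = 5, 3                               # core (old) and noncore (new) animals
B = rng.normal(size=(n1, n1))
G11 = B @ B.T + n1 * np.eye(n1)             # positive-definite core block
G12 = rng.normal(size=(n1, n2))
G11_inv = np.linalg.inv(G11)                # carried over from the previous evaluation

# Construct G22 so the conditional-independence assumption holds exactly
D22 = np.diag(rng.uniform(0.5, 1.0, size=n2))
G22 = G12.T @ G11_inv @ G12 + D22
G = np.block([[G11, G12], [G12.T, G22]])

# APY inverse: d_ii = g_ii - g_i1 G11^{-1} g_i1'; no inversion of G22 is needed
d = np.diag(G22) - np.einsum('ij,jk,ik->i', G12.T, G11_inv, G12.T)
D_inv = np.diag(1.0 / d)
T = G11_inv @ G12
G_apy_inv = np.block([[G11_inv + T @ D_inv @ T.T, -T @ D_inv],
                      [-D_inv @ T.T, D_inv]])
```

With real data the conditional-independence assumption holds only approximately, and `G_apy_inv` approximates **G**^{-1}.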

### Parallel processing

Computational time for matrix operations can be considerably reduced by parallel processing. The computational complexity of multiplying an *n*_{1} × *m* matrix by an *m* × *n*_{2} matrix is *n*_{1}*mn*_{2}. This complexity can be split across *n*_{1}, *n*_{2}, or *n*_{3} parallel processes, where *n*_{3} is an integer less than *n*_{1} and *n*_{2}. Matrix multiplication can be done in independent parallel processes; thus, the cost of communication across nodes is minimized. For example, row *i* of **AB** can be obtained independently from the other rows of **AB** by multiplying row *i* of **A** by **B**, or column *j* of **AB** can be obtained independently from the other columns of **AB** by multiplying **A** by column *j* of **B**. An R function for parallel processing of matrix multiplication is provided in Appendix B. Though not discussed here, parallel processing algorithms also exist for matrix inversion and are used in some genetic evaluation software.

### Updating A

Matrix **A** is not needed in genetic evaluation models, and its calculation is computationally more difficult than calculating **A**^{-1}. However, it is usually needed for postevaluation procedures, such as designing breeding schemes, preserving genetic variation, and controlling inbreeding in the population. Updating **A** is not different from resuming an incomplete construction of **A** to its full matrix. The following algorithm demonstrates a procedure for updating **A**. Assuming no pedigree errors (e.g., young animals recorded as parents of old animals) and no pedigree correction, the pedigree file for young animals has only young animals in the first (animal ID) column. Any row corresponding to animals not among the young animals is deleted. Then, **A** is updated with the following simple rules:


1. For animals in the new pedigree with no known parents, set their diagonal element of **A** to 1 and their relationships with the other animals to 0, and delete them from the new pedigree.
2. While the new pedigree is not empty:
   - 2.1. Find an animal whose parent(s) are not available in the first column of the new pedigree.
   - 2.2. Following Emik and Terrill (1949), the relationship of animal *i* with the others (already) in **A** is the average of the relationships of its parents, *s* and *d*, with those animals. The same is true for the relationship of the animal to its parents [*A*_{is} = (*A*_{ss} + *A*_{sd})/2, *A*_{id} = (*A*_{dd} + *A*_{sd})/2], and *A*_{ii} = 1 + *A*_{sd}/2.
   - 2.3. Delete that animal from the new pedigree.

This technique can be used, not only for updating, but also for constructing **A** from scratch. Sorting animals by age is not needed as it is built into the algorithm by picking animals in the right order (parents before progeny). Instead of updating **A**, which may include millions of animals, a subset of it including the animals of interest (e.g., the last *x* generations) and the parents of the new animals (if known) can be updated.
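A minimal Python sketch of these rules (function and variable names are illustrative) appends new animals to a stored **A**, picking parents before progeny; unknown parents are treated as unrelated, non-inbred base animals:

```python
import numpy as np

def update_A(A, ids, new_ped):
    """Append the animals in new_ped (tuples (animal, sire, dam);
    None for an unknown parent) to relationship matrix A over `ids`,
    using the rules of Emik and Terrill (1949)."""
    A, ids, ped = A.copy(), list(ids), list(new_ped)
    while ped:
        remaining = {a for a, _, _ in ped}
        # rule 2.1: pick an animal whose parents are no longer in the new pedigree
        a, s, d = next(r for r in ped if {r[1], r[2]}.isdisjoint(remaining))
        idx = {v: k for k, v in enumerate(ids)}
        # rule 2.2: average parental relationships; unknown parents contribute 0
        row = np.array([((A[idx[s], j] if s in idx else 0.0) +
                         (A[idx[d], j] if d in idx else 0.0)) / 2.0
                        for j in range(len(ids))])
        asd = A[idx[s], idx[d]] if (s in idx and d in idx) else 0.0
        A = np.block([[A, row[:, None]],
                      [row[None, :], np.array([[1.0 + asd / 2.0]])]])
        ids.append(a)
        ped.remove((a, s, d))  # rule 2.3
    return A, ids
```

For example, starting from `A = np.eye(2)` for founders 1 and 2, appending animals 3 = (1, 2) and 4 = (1, 3) gives *A*_{34} = 0.75 and *A*_{44} = 1.25, regardless of the order of the new pedigree records.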

There are available methods for calculating inbreeding coefficients (Tier 1990; Meuwissen and Luo 1992; Colleau 2002; Sargolzaei and Iwaisaki 2004; R.L. Quaas (1995), unpublished note) that can be adapted to updating situations (i.e., calculating inbreeding coefficients for young animals given the inbreeding coefficients of old animals). Sargolzaei and Iwaisaki (2005) compared the computational performance of these algorithms. The algorithms of R.L. Quaas (1995, unpublished note) and Sargolzaei and Iwaisaki (2004) are modifications of the algorithm of Meuwissen and Luo (1992). The performance of the algorithms depended on the number of generations and on population and family sizes, but the algorithm of Sargolzaei and Iwaisaki (2004) was generally the fastest. Computation time decreased considerably in updating situations, in which the algorithm of Sargolzaei and Iwaisaki (2004) again outperformed the other algorithms (Sargolzaei and Iwaisaki 2005). Colleau (2002) developed an indirect method for obtaining individual inbreeding coefficients and relationship statistics in the population. The method is indirect because, instead of element-wise calculation of the relationship matrix, groups of elements are calculated simultaneously. The computational efficiency of the method relies heavily on the sparseness of the inverse of the relationship matrix.

The computational complexity of calculating or updating **A** depends on many factors and differs from population to population and from animal to animal. It depends on finding the parents of each animal and searching for their relationships; on the size and sparsity of **A**; on the numbers of previous and new animals in the pedigree; and on the computational algorithm.

### Updating A^{-1}

Updating **A**^{-1} is not different from resuming an incomplete construction of **A**^{-1} to its full matrix. However, computationally, reading and updating an old **A**^{-1} is not justifiable over calculating **A**^{-1} from scratch. Calculation of **A**^{-1} is easier than calculation of **A**: the elements of **A**^{-1} for an animal are conditional only on its parents, whereas the elements of **A** for an animal are also conditional on its other relatives.
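This is why **A**^{-1} is cheap to build: each animal contributes a fixed, local set of elements determined only by its parents. A minimal sketch of Henderson's (1975) rules, ignoring parental inbreeding (`A_inverse` is an illustrative helper name):

```python
import numpy as np

def A_inverse(ped, ids):
    """Henderson's (1975) rules for A^{-1}, ignoring inbreeding.
    `ped` maps animal -> (sire, dam), with None for an unknown parent."""
    idx = {a: k for k, a in enumerate(ids)}
    Ainv = np.zeros((len(ids), len(ids)))
    for a, (s, d) in ped.items():
        par = [idx[p] for p in (s, d) if p is not None]
        # 1 / (Mendelian sampling variance), by number of known parents
        alpha = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}[len(par)]
        i = idx[a]
        Ainv[i, i] += alpha
        for p in par:
            Ainv[i, p] -= alpha / 2.0
            Ainv[p, i] -= alpha / 2.0
            for q in par:
                Ainv[p, q] += alpha / 4.0
    return Ainv
```

For the pedigree {1: (None, None), 2: (None, None), 3: (1, 2), 4: (1, 3)}, these rules give the exact inverse of **A**, since no parent in that pedigree is inbred.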

### Matrix storage

Conventionally, relationship matrices and their inverses are saved in a sparse upper- and (or) lower-triangle format with three columns for the IDs of the two animals and the matrix element. This saves disk space and makes indexing matrix elements easy. However, symmetric dense matrices can benefit from being stored as an array of upper- or lower-triangular values, which takes considerably less storage than a three-column data frame. In programming, this method of storing a matrix is called packed storage. It is used in Fortran's linear algebra libraries BLAS and LAPACK (Wikipedia Contributors 2013). For a packed matrix of length *n*, the matrix dimension is obtained as *N* = [√(1 + 8*n*) - 1]/2. For indexing a row-major packed upper triangle, each row *i* starts with the [(*i* - 1)(*N* - *i*/2 + 1) + 1]th element and ends with the [(2*N* - *i* + 1)*i*/2]th element of the vector, and an element from the *i*th row and *j*th column (*i* ≤ *j*) of the matrix is located at the [(*i* - 1)(*N* - *i*/2 + 1) + 1 + |*i* - *j*|]th element of the vector.
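These indexing formulas can be checked with a short Python sketch (`packed_index` is an illustrative helper):

```python
import numpy as np

def packed_index(i, j, N):
    """1-based position of element (i, j) of an N x N symmetric matrix
    stored as a row-major packed upper triangle."""
    i, j = min(i, j), max(i, j)                 # symmetry: map to the upper triangle
    return int((i - 1) * (N - i / 2 + 1) + 1 + (j - i))

N = 4
A = np.arange(1.0, N * N + 1).reshape(N, N)
A = (A + A.T) / 2.0                             # make it symmetric
packed = A[np.triu_indices(N)]                  # packed upper triangle, length N(N+1)/2

n = packed.size                                 # recover the dimension: N = [sqrt(1+8n)-1]/2
N_back = int((np.sqrt(1 + 8 * n) - 1) / 2)
```

`packed[packed_index(i, j, N) - 1]` returns `A[i - 1, j - 1]` for any `i`, `j`.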

### Updating vs. rebuilding

On deciding whether to update matrices or computing them from scratch, some consideration should be given to the following:

- The drive's read speed. Nowadays, there are commercial non-volatile memory express drives with a read speed over 5000 MB per second. There is less concern about the write speed, as writing occurs after the computations have been completed. In efficient programming, read-write instances are minimized.
- The size and the sparsity of the matrix to be read, compared with the size of the previous data file (e.g., pedigree or genotypes) needed to rebuild the matrix.
- Data file format (e.g., binary vs. ASCII).
- Data reading strategy (i.e., mapping the file into memory vs. reading the file into a buffer via a connection; reading a large file by memory mapping is faster). The pros and cons of specific data reading strategies are not in the scope of this paper.
- Computational complexity of the calculations.
- Central processing unit clock speed.
- Single-core vs. parallel computing.
- Storage vs. computation cost.

## Conclusion

An overview of the methods for updating different relationship matrices and their inverses was provided, and possible improvements were proposed. There are possibilities for reducing the computational time and resources required for calculating relationship matrices and their inverses by updating the previously calculated matrix for new animals and by parallel computing. The downside of updating matrices is allocating disk space for saving the matrix at time *t* and reading it for the evaluation at time *t* + 1. Both can be reduced by saving the old matrix half-stored (i.e., upper- or lower-triangle), sparse (i.e., skipping zero elements), and in binary format.

## Acknowledgements

The author acknowledges Livestock Improvement Corporation (LIC, Hamilton, NZ) for providing funds for the open access publication of this paper.

## References

Aguilar et al. 2010. *J. Dairy Sci.* 93: 743–752. doi:10.3168/jds.2009-2730. PMID:20105546.

Bradford et al. 2017. *J. Anim. Breed. Genet.* 134: 545–552. doi:10.1111/jbg.12276. PMID:28464315.

Christensen and Lund 2010. *Genet. Sel. Evol.* 42: 2. doi:10.1186/1297-9686-42-2. PMID:20105297.

Colleau 2002. *Genet. Sel. Evol.* 34: 409. doi:10.1186/1297-9686-34-4-409. PMID:12270102.

Emik and Terrill 1949. *J. Hered.* 40: 51–55. doi:10.1093/oxfordjournals.jhered.a105986. PMID:18116093.

Hager 1989. *SIAM Rev.* 31: 221–239. doi:10.1137/1031049.

Henderson 1973. *J. Anim. Sci.* 1973: 10–41. doi:10.1093/ansci/1973.symposium.10.

Henderson 1975. *J. Dairy Sci.* 58: 1727–1730. doi:10.3168/jds.s0022-0302(75)84776-x.

Meuwissen and Luo 1992. *Genet. Sel. Evol.* 24: 305–313. doi:10.1186/1297-9686-24-4-305.

Meyer et al. 2013. *J. Anim. Sci.* 91: 2583–2586. doi:10.2527/jas.2012-6056. PMID:23508030.

Misztal 2016. *Genetics* 202: 401–409. doi:10.1534/genetics.115.182089. PMID:26584903.

Misztal et al. 2014. *J. Dairy Sci.* 97: 3943–3952. doi:10.3168/jds.2013-7752. PMID:24679933.

Nilforooshan and Lee 2019. *J. Anim. Sci.* 97: 1090–1100. doi:10.1093/jas/skz010. PMID:30624671.

Ostersen et al. 2016. *Genet. Sel. Evol.* 48: 48. doi:10.1186/s12711-016-0227-8. PMID:27357825.

Quaas 1976. *Biometrics* 32: 949–953. doi:10.2307/2529279.

Quaas 1988. *J. Dairy Sci.* 71: 1338–1345. doi:10.3168/jds.s0022-0302(88)79691-5.

Sargolzaei and Iwaisaki 2004. *Jpn. J. Biom.* 25: 25–36. doi:10.5691/jjb.25.25.

Sargolzaei and Iwaisaki 2005. *Anim. Sci. J.* 76: 401–406. doi:10.1111/j.1740-0929.2005.00282.x.

Tier 1990. *Genet. Sel. Evol.* 22: 419–430. doi:10.1186/1297-9686-22-4-419.

VanRaden 2008. *J. Dairy Sci.* 91: 4414–4423. doi:10.3168/jds.2007-0980. PMID:18946147.

Vitezica et al. 2011. *Genet. Res.* 93: 357–366. doi:10.1017/s001667231100022x. PMID:21767459.

## Appendices

### Appendix A

Assume a **G** (genomic relationship) matrix is available for *n* genotypes in a genotype matrix **M**. Before expanding **G** for new genotypes, it needs to be corrected to **G*** (the same size as **G**) for the allele frequencies (*p*) being changed to *p** due to the new genotypes. The aim is to obtain regression coefficients *α* and *b*, based on *p* and *p**, to predict (a regression, not an exact estimation) **G*** as **G̃** = *b***G** + *α*, i.e., **G** as if *p** was used instead of *p* (*p*_{k} = allele frequency for marker *k* in *p*).

**G** = *c***ZZ**′ with *c* = 1/[2Σ*p*_{k}(1 - *p*_{k})], and **G*** = *c****Z***Z**′ with *c** = 1/[2Σ*p*_{k}*(1 - *p*_{k}*)]. Equating the sums of the elements of **G̃** and **G***,

$$\sum_{i}\sum_{j}\tilde{G}_{ij} = b\sum_{i}\sum_{j}G_{ij} + n^{2}\alpha = \sum_{i}\sum_{j}G^{*}_{ij}$$

To obtain the unknown **G***, it is assumed that the slope of **G*** on **G** is due to the applied coefficients (i.e., *b* = *c**/*c*). Thus,

$$\alpha = \frac{1}{n^{2}}\left(\mathbf{1}'\mathbf{G}^{*}\mathbf{1} - b\,\mathbf{1}'\mathbf{G}\mathbf{1}\right) = \frac{c^{*}}{n^{2}}\left(\mathbf{1}'\mathbf{Z}^{*}\mathbf{Z}^{*\prime}\mathbf{1} - \mathbf{1}'\mathbf{Z}\mathbf{Z}'\mathbf{1}\right)$$

Similarly to **Z** = **M** - 2**P**, **Z*** = **M** - 2**P*** with **P*** = **1***p**′. Therefore,

$$\mathbf{Z}^{*\prime}\mathbf{1} = \mathbf{M}'\mathbf{1} - 2n\mathbf{p}^{*}, \quad \mathbf{Z}'\mathbf{1} = \mathbf{M}'\mathbf{1} - 2n\mathbf{p}$$

Substituting the above line in the formula for *α* gives

$$\alpha = \frac{c^{*}}{n^{2}}\left[4n^{2}\left(\mathbf{p}^{*\prime}\mathbf{p}^{*} - \mathbf{p}'\mathbf{p}\right) - 4n\left(\mathbf{p}^{*} - \mathbf{p}\right)'\mathbf{M}'\mathbf{1}\right] = 4c^{*}\left[\mathbf{p}^{*\prime}\mathbf{p}^{*} - \mathbf{p}'\mathbf{p} - \left(\mathbf{p}^{*} - \mathbf{p}\right)'\mathbf{M}'\mathbf{1}/n\right]$$

### Appendix B

R function for parallel processing of matrix multiplication:

```r
library("parallel")

mmultpar <- function(A, B, ncl) {
  if (ncol(A) != nrow(B)) stop("ERROR: Dimension mis-match")
  if (ncl < 1) stop("ERROR: ncl should be a positive integer.")
  if ((ncl < nrow(A)) & (ncl < ncol(B))) {
    # Split the rows of A into ncl chunks, one per worker
    cl = makeCluster(ncl)
    Alist = lapply(splitIndices(nrow(A), length(cl)), function(x) A[x, , drop = FALSE])
    ans = clusterApply(cl, Alist, get("%*%"), B)
    stopCluster(cl)
    return(do.call(rbind, ans))
  } else if (nrow(A) > ncol(B)) {
    # ncl exceeds the matrix dimensions: one worker per column of B
    cl = makeCluster(ncol(B))
    Blist = lapply(1:ncol(B), function(x) t(B)[x, , drop = FALSE])
    ans = clusterApply(cl, Blist, get("%*%"), t(A))
    stopCluster(cl)
    return(t(do.call(rbind, ans)))
  } else {
    # ncl exceeds the matrix dimensions: one worker per row of A
    cl = makeCluster(nrow(A))
    Alist = lapply(1:nrow(A), function(x) A[x, , drop = FALSE])
    ans = clusterApply(cl, Alist, get("%*%"), B)
    stopCluster(cl)
    return(do.call(rbind, ans))
  }
}
```

Three objects are required: (*i*) the left-side matrix (`A`), (*ii*) the right-side matrix (`B`), and (*iii*) the user-defined number of cluster nodes (`ncl`).

The processes are independent, and `ncl` is not limited to the number of available nodes on the computer. If `ncl` is not smaller than both `nrow(A)` and `ncol(B)`, `min(nrow(A), ncol(B))` is applied as the number of cluster nodes.