Glam Prestige Journal

Bright entertainment trends with youth appeal.

$\begingroup$

I am trying to invert a matrix using Woodbury identity. The inversion using Cholesky decomposition has the following pseudo-code:

For $t=1,2,...$

$(1)\;\; \text{Read}\;x_t\in\mathbb{R}^n$

$(2)\;\;D_{t-1}=diag(|\theta_t^1|,...,|\theta_t^n|)$

$(3)\;\;A_t=A_t+x_tx_t'$

$(4)\;\;A_{t}^{-1}=\sqrt{D_{{t-1}}}\left(a\mathbf{I}+\sqrt{D_{{t-1}}}A_t\sqrt{D_{{t-1}}}\right)^{-1}\sqrt{D_{{t-1}}}$

$(5)\;\;\text{Read}\;y_t\in\mathbb{R}$

$(6)\;\;b=b+y_tx_t$

$(7)\;\;\theta_t=A_{t}^{-1}b$

End For

The well known application of Sherman-Morrison on Recursive Least Squares is as follows:

$$A_t^{-1}=(aI+x_tx_t')^{-1}=A_{t-1}^{-1} - \frac{(A_{t-1}^{-1}x_t)(A_{t-1}^{-1}x_t)'}{1+x_t'A_{t-1}^{-1}x_t}$$

where $A_0^{-1}=\frac{1}{a}I$ and we can set $A_t = A_{t-1}+\sum_{t=1}^Tx_tx_t'$, which will lead to time complexity of $O(n^2)$.

The above technique is mentioned here. The two implementation in $\texttt{R}$ are as follows:

X <-matrix(runif(1000),20,10)
Y<-rnorm(20)
a<- 0.1
Cholsky<-function(X,Y,a){ X <- as.matrix(X) Y <- as.matrix(Y) T <- nrow(X) N <- ncol(X) aI<- diag(a,N) bt<- matrix(0,ncol=1,nrow=N) for (t in 1:T){ xt<-X[t,] At <- aI + (xt %*% t(xt)) InvA<-chol2inv(chol(At)) bt <- bt + (Y[t] * xt) theta<- InvA %*% bt } return(theta)
}
Cholsky(X,Y,a)
Morrison<-function(X,Y,a){ X <- as.matrix(X) Y <- as.matrix(Y) T <- nrow(X) N <- ncol(X) At<-diag(1/a,N) bt<- matrix(0,ncol=1,nrow=N) for (t in 1:T){ xt<-X[t,] At <- At + (xt %*% t(xt)) InvA <- At - ((t(xt%*%At)%*%(as.matrix(xt%*%At))) /as.numeric(xt%*%At%*%xt+1)) bt <- bt + (Y[t] * xt) theta<- InvA %*% bt } return(theta)
}
Morrison(X,Y,a)

They don't give the same result. So, perhaps I should not expect the implementations to be equivalent.

I was wondering if I could invert the following (for the above case) more efficiently:

$$A_t^{-1}=\left(D_{t-1}^{\frac{1}{2}}A_tD_{t-1}^{\frac{1}{2}}+aI\right)^{-1}$$

where $A_t=A_{t-1}+x_tx_t'$ and $A_0=\mathbf{0}$.

Essentially, I want to invert:

$$M_t=\left(a\mathbf{I}+\sqrt{D_{{t-1}}}x_tx_t'\sqrt{D_{{t-1}}}\right)$$

I say:

$$M_{t}^{-1}=M_{t-1}^{-1}-\frac{(M_{t-1}D_{t-1}^{\frac{1}{2}}x_t)(M_{t-1}D_{t-1}^{\frac{1}{2}}x_t)'}{1+x_t'D_{t-1}^{\frac{1}{2}}M_{t-1}D_{t-1}^{\frac{1}{2}}x_t}$$

where $D_0=diag(\mathbf{1}$). So, $M^{-1}_0=\frac{1}{a}I$

RLS identity given above $(aI+u_tv_t)^{-1}$ uses $u=A_{t-1}^{-1}x_t,v_t=u_t'$, I am using $u=M_{t-1}^{-1}D_{t-1}^{\frac{1}{2}}x_t,v_t=u_t'$

One may write the implementations in $\texttt{R}$ as follows:

X <-matrix(runif(1000),20,10)
Y<-rnorm(20)
a<- 0.1
#Cholesky implementation
X <- as.matrix(X)
Y <- as.matrix(Y)
T <- nrow(X)
N <- ncol(X)
bt<- matrix(0,ncol=1,nrow=N)
At<- diag(0,N)
I<- diag(a,N);Mt<-diag(1/a,N)
theta0<- rep(1,N)
for (t in 1:2){ xt<-X[t,] Dt <- diag(sqrt(abs(as.numeric(theta0)))) At <- At + (xt %*% t(xt)) Mt <- I + (Dt%*%At%*%Dt) InvA <- chol2inv(chol( Mt )) AAt<- Dt %*%InvA%*% Dt bt <- bt + (Y[t] * xt) theta0 <- AAt %*% bt print(theta0)
}

Above is the correct implementation of the pseudo code. If I swap the following lines. I don't get the same answer.

Mt <- Mt + (Dt%*%At%*%Dt)
InvA<- Mt - ((t(xt%*%Dt%*%Mt)%*%(as.matrix(xt%*%Dt%*%Mt))) /as.numeric(xt%*%Dt%*%Mt%*%t(as.matrix(xt%*%Dt))+1))

Why is that?

$\endgroup$ 10 Reset to default

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy