Cholesky Factor Investigation

Cholesky Factor Investigation

In this notebook I look at a few (4) cholesky factors, used to whiten KSE data prior to implementing a least dquares solver to compute the Wiener filter.

The sizes of the Cholesky factors are 50, 100, 200, and 500.

Below is a section for each size. Within each section are a number of plots of (1) the the powerspectrum of various columns of the whitened design matrix and (2) the above diagonal columns of the Cholesky factors (C, C100, C200, C500). Some section also contain calculations for the infinity norm of the difference between corresponding submatrices of the various Cholesky factors.

The data suggests that

  • there is convergence of the whitening filter exracted from columns of the cholesky factor.
  • Larger Cholesky factors reproduce the entryies of smaller factors.
In [1]:
using JLD
using LinearAlgebra
using Statistics
using PyPlot

at = include("../../../../Tools/AnalysisToolbox.jl")
Unable to init server: Could not connect: Connection refused
Unable to init server: Could not connect: Connection refused

(.:118195): Gdk-CRITICAL **: 13:22:05.118: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed

(.:118195): Gdk-CRITICAL **: 13:22:05.122: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed
Out[1]:
Main.AnalysisToolbox
In [2]:
function get_hash(pred::Array{<:Number,2};M_out = 30) where T<: Number;
    nu, steps = size(pred)

    pred = Array(transpose(pred))

    PRED = zeros(ComplexF64,steps-M_out+1,M_out*nu)
    for m = 1:M_out
        PRED[:,nu*(M_out-m)+1:nu*(M_out-m+1)] = @view pred[m:(steps + m - M_out),:]
    end

    PRED = reverse(PRED,dims = 2)
    PRED
end
Out[2]:
get_hash (generic function with 1 method)
In [3]:
# Load Data

#This is writen to allow for loading any of 5 different runs of the KSE solution

geni = 1
gen = "lin1e5_r$geni"     # this is just a reference designation it shows up in the
                          # output file. I think of generatrion.

savefile_pred = "../../../../../data/KSE_Data/ks_pred_$gen.jld"
println("Sol load location: " * savefile_pred)

Pred = load(savefile_pred,"pred");
Sig  = load(savefile_pred,"sig")

sig = Sig[1:1,:]
pred = Pred[1:1,:]

size(pred), size(sig)
Sol load location: ../../../../../data/KSE_Data/ks_pred_lin1e5_r1.jld
Out[3]:
((1, 499999), (1, 499999))

M = 50

In [4]:
function get_cholfact(pred, M_out; pcond = 1, eps = 1e-5, verb = true)
    Xh = get_hash(pred; M_out = M)

    w = I

    for i = 1:pcond
        σ = eps^2*var(Xh[:,M])
        A = Xh'Xh/size(Xh,1) + σ*I
        verb && println("preconditioning $i original condition number: ",cond(A))

        println("Information on the factoring of the approximated covariance matrix:")
        @timev L = cholesky(A).L
        verb && println("preconditioning $i Cholesky condition number: ",cond(L))

        W = inv(conj(L))
        w = W*w

        Xh  = Xh*transpose(W)
        verb && println("preconditioning $i new condition number: ",cond(Xh))
    end

    C = transpose(w);
    Xh, C
end
M = 50

Xh, C = get_cholfact(pred, M);
preconditioning 1 original condition number: 4.515891087678836e11
Information on the factoring of the approximated covariance matrix:
  0.153910 seconds (551.43 k allocations: 29.776 MiB)
elapsed time (ns): 153909531
bytes allocated:   31222302
pool allocs:       551342
non-pool GC allocs:85
malloc() calls:    2
preconditioning 1 Cholesky condition number: 672003.083538887
preconditioning 1 new condition number: 2.1113371879535713e10

Whitened Processes

In [5]:
function plot_proc(Xh, M_out; Nex = 2^13, Θ = 2π*(0:Nex-1)/Nex)
    for j = 1:10:M_out
        figure()
        for i = 0:3:10
            title("Power spectrum of various columns of whitened design matrix ($M_out × $M_out)")
            S = at.z_crossspect_dm(Xh[:,i+j],Xh[:,i+j]; Nex)
            loglog(Θ,real(S), label = "Column $(i+j)")
            ylabel("Power")
            xlabel("Frequency")
            legend()
        end
    end
end

plot_proc(Xh, M)

Whitening Filters

In [6]:
function plot_WF(C, M_out)
    for j = 1:10:M_out
        figure()
        for i = 0:3:10
            title("Relevent portions (diagonal upwards) of various columns Cholesky factor ($M_out × $M_out)")
            plot(real(C[i+j:-1:1,i+j]), label = "Column $(i+j)")
            ylabel("Real part of coefficient")
            xlabel("lag")
            legend()
        end
    end
end

plot_WF(C, M)

M = 100

In [7]:
M = 100

Xh, C100 = get_cholfact(pred, M);
preconditioning 1 original condition number: 7.193660573312194e11
Information on the factoring of the approximated covariance matrix:
  0.000297 seconds (8 allocations: 312.750 KiB)
elapsed time (ns): 296964
bytes allocated:   320256
pool allocs:       6
malloc() calls:    2
preconditioning 1 Cholesky condition number: 848156.2347521723
preconditioning 1 new condition number: 2.1258668128362022e10

Whitened Processes

In [8]:
plot_proc(Xh, M)

Whitening Filters

In [9]:
plot_WF(C100, M)
In [10]:
function plot_WF_comp(C1,C2)
    m1 = size(C1,1)
    m2 = size(C2,1)
    for j = 1:10:min(m1,m2)
        figure()
        for i = 0:3:10
            title("Overlay of whitening filters from corresponding columns of ($m1 ×$m1 solid) and ($m2 × $m2 dotted) cholesky factors")
            plot(real(C1[i+j:-1:1,i+j]), label = "Column $(i+j)")
            plot(real(C2[i+j:-1:1,i+j]),":", label = "Column $(i+j)")
            ylabel("Real part of coefficient")
            xlabel("lag")
            legend()
        end
    end
end

plot_WF_comp(C,C100)

M = 200

In [11]:
M = 200

Xh, C200 = get_cholfact(pred, M);
preconditioning 1 original condition number: 8.871032880972386e11
Information on the factoring of the approximated covariance matrix:
  0.000609 seconds (8 allocations: 1.221 MiB)
elapsed time (ns): 609123
bytes allocated:   1280256
pool allocs:       6
malloc() calls:    2
preconditioning 1 Cholesky condition number: 941831.4839237581
preconditioning 1 new condition number: 2.1479381762691734e10

Whitened Processes

In [12]:
plot_proc(Xh, M)

Whitening Filters

In [13]:
plot_WF(C200, M)
In [14]:
plot_WF_comp(C,C200)
In [15]:
plot_WF_comp(C100,C200)
In [16]:
norm(C - C100[1:50,1:50], Inf)
Out[16]:
16.178588059907387
In [17]:
norm(C - C200[1:50,1:50], Inf)
Out[17]:
16.048321291550536
In [18]:
norm(C100 - C200[1:100,1:100], Inf)
Out[18]:
10.986768494740689
In [19]:
norm(C100[1:50,1:50] - C200[1:50,1:50], Inf)
Out[19]:
10.707545606778593
In [20]:
norm(C100[51:100,51:100] - C200[51:100,51:100], Inf)
Out[20]:
10.986768494740689
In [21]:
norm(C200[101:150,101:150] - C200[151:200,151:200], Inf)
Out[21]:
123.98194290804581
In [22]:
var(C[:]),var(C100[:]), var(C200[:])
Out[22]:
(3.0091731925125976e9, 1.5747567285051358e9, 8.04880621177596e8)

M = 500

In [23]:
M = 500

Xh, C500 = get_cholfact(pred, M);
preconditioning 1 original condition number: 1.337522166745713e12
Information on the factoring of the approximated covariance matrix:
  0.003172 seconds (8 allocations: 7.630 MiB)
elapsed time (ns): 3171829
bytes allocated:   8000256
pool allocs:       6
malloc() calls:    2
preconditioning 1 Cholesky condition number: 1.156459398364714e6
preconditioning 1 new condition number: 2.1740673583840195e10

Whitened Processes

In [24]:
plot_proc(Xh, M)

Whitening Filters

In [25]:
plot_WF(C500, M)
In [26]:
plot_WF_comp(C,C500)
In [27]:
plot_WF_comp(C100,C500)
In [28]:
plot_WF_comp(C200,C500)
In [29]:
var(C[:]),var(C100[:]), var(C200[:]), var(C500[:])
Out[29]:
(3.0091731925125976e9, 1.5747567285051358e9, 8.04880621177596e8, 3.262008162253893e8)
In [30]:
var(real(C[:])),var(real(C100[:])), var(real(C200[:])), var(real(C500[:]))
Out[30]:
(3.00917316554173e9, 1.5747566987232587e9, 8.048806084437557e8, 3.2620054497670525e8)
In [31]:
var(imag(C[:])),var(imag(C100[:])), var(imag(C200[:])), var(imag(C500[:]))
Out[31]:
(26.97086807873767, 29.781876773132623, 12.73384018298806, 271.2486839247289)
In [32]:
mean(imag(C[:])),mean(imag(C100[:])), mean(imag(C200[:])), mean(imag(C500[:]))
Out[32]:
(6.252999039963161e-5, 4.336223303895026e-5, 3.19063683587159e-5, 1.8619458452274962e-5)