dNTF
In this vignette, we consider approximating a non-negative tensor as a product of binary or non-negative low-rank matrices (a.k.a. factor matrices).
Test data is available from toyModel.
You will see that there are four blocks in the data tensor as follows.
To decompose a binary tensor $\mathcal{X}$, non-negative CP decomposition (a.k.a. non-negative tensor factorization; NTF (Cichocki 2007; Cichocki 2009)) can be applied. NTF approximates $\mathcal{X}$ ($N \times M \times L$) as the mode product of a core tensor $\mathcal{S}$ ($J \times J \times J$) and factor matrices $A_1$ ($J \times N$), $A_2$ ($J \times M$), and $A_3$ ($J \times L$):

$$
\mathcal{X} \approx \mathcal{S} \times_1 A_1 \times_2 A_2 \times_3 A_3 \quad \mathrm{s.t.} \quad \mathcal{S} \geq 0,\ A_k \geq 0 \ (k = 1, \ldots, 3)
$$

Note that $\times_k$ is the mode-$k$ product (Cichocki 2009) and that the core tensor $\mathcal{S}$ has non-zero values only in its diagonal elements. For details, see the NTF function of the nnTensor package.
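Because the core tensor is diagonal, the approximation can be read as a sum of J rank-one terms. The following base R sketch illustrates this under the storage convention seen in this vignette (core stored as a length-J vector, factor matrices stored as J × N, J × M, and J × L). The helper name rec_cp and the toy inputs are hypothetical and are not part of dcTensor or nnTensor.

```r
# Hypothetical helper (not part of dcTensor or nnTensor): rebuild a 3-way
# tensor from a diagonal core, stored as a length-J vector, and three
# factor matrices stored as J x N, J x M, and J x L.
rec_cp <- function(S, A) {
  dims <- vapply(A, ncol, integer(1))
  X <- array(0, dim = dims)
  for (j in seq_along(S)) {
    # Rank-one term: S_j * (row j of A1) o (row j of A2) o (row j of A3)
    X <- X + S[j] * outer(outer(A[[1]][j, ], A[[2]][j, ]), A[[3]][j, ])
  }
  X
}

# Toy inputs with J = 2 and binary factor matrices (illustration only)
S <- c(2, 3)
A <- list(
  rbind(c(1, 1, 0), c(0, 0, 1)),        # A1: 2 x 3
  rbind(c(1, 0), c(0, 1)),              # A2: 2 x 2
  rbind(c(1, 0, 0, 0), c(0, 1, 1, 0))   # A3: 2 x 4
)
X_hat <- rec_cp(S, A)
dim(X_hat)
```

Each component j contributes one block to the reconstructed tensor: the block's support is set by the non-zero entries of row j in each factor matrix, and its height is set by the j-th core value.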
In binary tensor factorization (BTF), a rank parameter J ( ≤ min (N, M)) must
be set in advance. Other settings such as the number of iterations (num.iter)
or the factorization algorithm (algorithm) are also available. For details of
the arguments of dNTF, see ?dNTF. After the calculation, dNTF returns various
objects. BTF is achieved by setting the binary regularization parameter
(Bin_A) to a large value, as below:
set.seed(123456)
out_dNTF <- dNTF(X, Bin_A=c(1e+2, 1e+2, 1e+2), algorithm="KL", rank=4)
str(out_dNTF, 2)
## List of 6
## $ S : num [1:4] 2.24 2.23 2.24 2.24
## $ A :List of 3
## ..$ : num [1:4, 1:30] 9.99e-01 2.22e-16 2.22e-16 1.00 9.99e-01 ...
## ..$ : num [1:4, 1:30] 1.00 2.22e-16 2.22e-16 2.22e-16 1.00 ...
## ..$ : num [1:4, 1:30] 4.47e-01 9.94e-17 9.93e-17 9.93e-17 4.47e-01 ...
## $ RecError : Named num [1:28] 1.00e-09 2.67e+01 2.45e+01 2.36e+01 2.27e+01 ...
## ..- attr(*, "names")= chr [1:28] "offset" "1" "2" "3" ...
## $ TrainRecError: Named num [1:28] 1.00e-09 2.67e+01 2.45e+01 2.36e+01 2.27e+01 ...
## ..- attr(*, "names")= chr [1:28] "offset" "1" "2" "3" ...
## $ TestRecError : Named num [1:28] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
## ..- attr(*, "names")= chr [1:28] "offset" "1" "2" "3" ...
## $ RelChange : Named num [1:28] 1.00e-09 2.56e-02 8.80e-02 4.16e-02 3.89e-02 ...
## ..- attr(*, "names")= chr [1:28] "offset" "1" "2" "3" ...
The reconstruction error (RecError) and the relative change (RelChange, the
amount of change from the reconstruction error in the previous step) can be
used to diagnose whether the calculation has converged.
layout(t(1:2))
plot(log10(out_dNTF$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_dNTF$RelChange[-1]), type="b", main="Relative Change")
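In addition to inspecting the plots, the same diagnosis can be made numerically. The sketch below assumes a vector laid out like out_dNTF$RelChange above (a named numeric vector whose first entry is "offset"). The helper has_converged, its tolerance of 1e-4, and the mock values are hypothetical choices for illustration, not defaults or output of dNTF.

```r
# Hypothetical helper: report whether the last relative change is below a
# tolerance. `rel_change` is assumed to be a named numeric vector whose
# first entry is "offset", as in the str() output above.
has_converged <- function(rel_change, tol = 1e-4) {
  steps <- rel_change[-1]                    # drop the "offset" entry
  if (length(steps) == 0) return(FALSE)      # no iterations recorded
  last <- unname(steps[length(steps)])
  is.finite(last) && last < tol
}

# Mock values for illustration only (not output of dNTF)
mock <- c(offset = 1e-9, "1" = 2.5e-2, "2" = 3.0e-3, "3" = 5.0e-5)
has_converged(mock)
has_converged(mock, tol = 1e-6)
```

A run that stops at the maximum number of iterations with a relative change still above the chosen tolerance may need more iterations or a different setting.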
The product of the core tensor S and the factor matrices Ak shows whether
the original data is well recovered by dNTF.
The histograms of the Ak show that all the factor matrices look binary.
layout(t(1:3))
hist(out_dNTF$A[[1]], main="A1", breaks=100)
hist(out_dNTF$A[[2]], main="A2", breaks=100)
hist(out_dNTF$A[[3]], main="A3", breaks=100)
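As a numeric complement to the histograms, one could measure the fraction of entries that lie close to 0 or 1. The following is a minimal sketch in base R. The helper binary_fraction, its tolerance, and the example matrices are hypothetical. Note that a factor matrix may be two-valued at a scale other than 1 (the third factor matrix in the output above takes values near 0 and 0.447), so the sketch includes an optional rescaling by the maximum entry.

```r
# Hypothetical helper: fraction of entries within `tol` of 0 or 1.
# With rescale = TRUE the matrix is first divided by its largest entry,
# so a two-valued matrix such as {0, 0.447} is treated as {0, 1}.
binary_fraction <- function(A, tol = 1e-2, rescale = FALSE) {
  if (rescale && max(A) > 0) A <- A / max(A)
  mean(pmin(abs(A), abs(A - 1)) < tol)
}

# Made-up matrices for illustration only
near_binary <- matrix(c(0.999, 2.22e-16, 1, 2.22e-16, 0.998, 1e-8), nrow = 2)
two_valued  <- matrix(c(0.447, 1e-16, 0.447, 1e-16), nrow = 2)
graded      <- matrix(c(0.0070, 0.4755, 0.1385, 0.2206, 0.0447, 0.0048), nrow = 2)

binary_fraction(near_binary)
binary_fraction(two_valued)
binary_fraction(two_valued, rescale = TRUE)
binary_fraction(graded)
```

A value near 1 suggests that the factor matrix is close to binary, while a clearly lower value suggests graded entries.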
Here, we define this formalization as semi-binary tensor factorization (SBTF). SBTF can capture discrete patterns in non-negative tensors.
To demonstrate SBTF, we next use a non-negative tensor from the nnTensor
package. You will see that there are four blocks in the data tensor as
follows.
In SBTF, a rank parameter J ( ≤ min (N, M)) must be set in advance. Other
settings such as the number of iterations (num.iter) or the factorization
algorithm (algorithm) are also available. For details of the arguments of
dNTF, see ?dNTF. After the calculation, dNTF returns various objects. SBTF is
achieved by setting the binary regularization parameter (Bin_A) to a large
value for only some of the factor matrices, as below:
set.seed(123456)
out_dNTF2 <- dNTF(X2, Bin_A=c(1e+5, 1e+5, 1e-10), algorithm="KL", rank=4)
str(out_dNTF2, 2)
## List of 6
## $ S : num [1:4] 13.1 31.7 112.1 1474.1
## $ A :List of 3
## ..$ : num [1:4, 1:30] 0.00704 0.00175 0.47548 0.00303 0.00653 ...
## ..$ : num [1:4, 1:30] 0.00905 0.00602 0.10119 0.00226 0.0092 ...
## ..$ : num [1:4, 1:30] 0.1385 0.2206 0.0447 0.0048 0.1523 ...
## $ RecError : Named num [1:101] 1.00e-09 2.46e+04 3.90e+03 2.96e+03 6.38e+03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TrainRecError: Named num [1:101] 1.00e-09 2.46e+04 3.90e+03 2.96e+03 6.38e+03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ RelChange : Named num [1:101] 1.00e-09 8.23e-01 5.30 3.19e-01 5.36e-01 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
RecError and RelChange can be used to diagnose whether the calculation has
converged.
layout(t(1:2))
plot(log10(out_dNTF2$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_dNTF2$RelChange[-1]), type="b", main="Relative Change")
The product of the core tensor S and the factor matrices Ak shows whether
the original data is well recovered by dNTF.
recX <- recTensor(out_dNTF2$S, out_dNTF2$A)
layout(t(1:2))
plotTensor3D(X2)
plotTensor3D(recX, thr=0)
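The visual comparison above can be complemented by a single number. The sketch below computes a relative Frobenius error between two plain arrays of the same shape. The helper rel_frobenius_error and the toy arrays are hypothetical, and this measure is not claimed to match the definition of RecError used by dNTF. It also assumes plain R arrays, so a reconstructed object of another class would first need to be converted to an array.

```r
# Hypothetical helper: relative Frobenius error between two plain arrays.
# 0 means a perfect reconstruction; larger values mean a worse fit.
rel_frobenius_error <- function(X, X_hat) {
  stopifnot(identical(dim(X), dim(X_hat)))
  sqrt(sum((X - X_hat)^2)) / sqrt(sum(X^2))
}

# Toy arrays for illustration only
X_toy <- array(0, dim = c(2, 2, 2))
X_toy[1, 1, 1] <- 1
X_toy[2, 2, 2] <- 1

X_toy_hat <- X_toy
X_toy_hat[1, 1, 1] <- 0.5   # one entry is only half recovered

rel_frobenius_error(X_toy, X_toy)
rel_frobenius_error(X_toy, X_toy_hat)
```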
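The histograms of the Ak show how close each factor matrix is to binary. With the Bin_A setting above, the binary regularization is strong for A1 and A2 and negligible for A3.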
layout(t(1:3))
hist(out_dNTF2$A[[1]], main="A1", breaks=100)
hist(out_dNTF2$A[[2]], main="A2", breaks=100)
hist(out_dNTF2$A[[3]], main="A3", breaks=100)
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] nnTensor_1.3.0 fields_16.3 viridisLite_0.4.2 spam_2.11-1
## [5] dcTensor_1.3.0 rmarkdown_2.29
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 jsonlite_1.8.9 compiler_4.4.2 maps_3.4.2.1
## [5] Rcpp_1.0.14 plot3D_1.4.1 tagcloud_0.6 jquerylib_0.1.4
## [9] scales_1.3.0 yaml_2.3.10 fastmap_1.2.0 ggplot2_3.5.1
## [13] R6_2.5.1 tcltk_4.4.2 knitr_1.49 MASS_7.3-64
## [17] dotCall64_1.2 misc3d_0.9-1 tibble_3.2.1 maketools_1.3.1
## [21] munsell_0.5.1 pillar_1.10.1 bslib_0.9.0 RColorBrewer_1.1-3
## [25] rlang_1.1.5 cachem_1.1.0 xfun_0.50 sass_0.4.9
## [29] sys_3.4.3 cli_3.6.3 magrittr_2.0.3 digest_0.6.37
## [33] grid_4.4.2 rTensor_1.4.8 lifecycle_1.0.4 vctrs_0.6.5
## [37] evaluate_1.0.3 glue_1.8.0 buildtools_1.0.0 colorspace_2.1-1
## [41] pkgconfig_2.0.3 tools_4.4.2 htmltools_0.5.8.1