restrictedboltzmannmachines.jl's Introduction

RestrictedBoltzmannMachines Julia package

Train and sample Restricted Boltzmann machines in Julia.

Installation

This package is registered. Install with:

import Pkg
Pkg.add("RestrictedBoltzmannMachines")

This package does not export any symbols. Since the name RestrictedBoltzmannMachines is long, it can be imported as:

import RestrictedBoltzmannMachines as RBMs
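
A minimal usage sketch, assembled from calls that appear in the issues further below (a BinaryRBM constructed from visible bias, hidden bias and weights, and sample_v_from_v with a steps keyword) and using the RBMs alias above; the exact API may differ between releases, so treat it as a sketch rather than documentation:

# Binary-Binary RBM with 28×28 = 784 visible units and 128 hidden units
# (constructor arguments: visible bias, hidden bias, weights)
rbm = RBMs.BinaryRBM(zeros(784), zeros(128), zeros(784, 128))

# Gibbs sampling: start from random visible configurations and update them
v0 = float.(rand(Bool, 784, 64))               # 64 random configurations
v = RBMs.sample_v_from_v(rbm, v0; steps = 10)  # 10 Gibbs sweeps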

Related packages

Use RBMs on the GPU (CUDA):

Centered and standardized RBMs:

Adversarially constrained RBMs:

Save RBMs to HDF5 files:

Stacked tempering:

Citation

If you use this package in a publication, please cite:

  • Jorge Fernandez-de-Cossio-Diaz, Simona Cocco, and Remi Monasson. "Disentangling representations in Restricted Boltzmann Machines without adversaries." Physical Review X 13, 021003 (2023).

Or you can use the included CITATION.bib.

restrictedboltzmannmachines.jl's People

Contributors

cossio, dependabot[bot]

restrictedboltzmannmachines.jl's Issues

pReLU eta must be between -1 and 1

We can enforce this constraint by parameterizing eta, for example:

eta = tanh(xi)

where now xi is unbounded.

We can use something faster than tanh. Here are some examples:

[Figure: timing comparison of candidate functions for mapping an unbounded xi into (-1, 1)]

This suggests that the faster alternative is:

eta = xi / (1 + abs(xi))
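
As a sketch (the squash_* / unsquash_* names are made up here for illustration), both maps send an unbounded ξ into (-1, 1) and are easy to invert when initializing ξ from a desired η:

# Two bounded reparameterizations of eta; both map ℝ → (-1, 1)
squash_tanh(ξ) = tanh(ξ)
squash_abs(ξ) = ξ / (1 + abs(ξ))    # "softsign", cheaper than tanh

# Inverses, useful to initialize ξ from a given η with |η| < 1
unsquash_tanh(η) = atanh(η)
unsquash_abs(η) = η / (1 - abs(η))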

Zero hidden units

Many ops fail with zero hidden units, even though they are well-defined mathematically.

julia> reshape(randn(0,4),0,:)
ERROR: DivideError: integer division error
Stacktrace:
 [1] div
   @ ./int.jl:284 [inlined]
 [2] divrem
   @ ./div.jl:162 [inlined]
 [3] divrem
   @ ./div.jl:158 [inlined]
 [4] _reshape_uncolon
   @ ./reshapedarray.jl:127 [inlined]
 [5] reshape(parent::Matrix{Float64}, dims::Tuple{Int64, Colon})
   @ Base ./reshapedarray.jl:118
 [6] reshape(::Matrix{Float64}, ::Int64, ::Colon)
   @ Base ./reshapedarray.jl:117
 [7] top-level scope
   @ REPL[1]:1
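
A possible workaround, purely as a sketch (flatten_batches is a hypothetical helper, not part of the package): compute the batch dimension explicitly instead of relying on Colon, so the zero-units case never reaches the ambiguous division.

# Reshape an array of unit values into (units, batches) without using Colon,
# which fails when the number of units is zero.
function flatten_batches(A::AbstractArray, unitdims::Int)
    unitsize = size(A)[1:unitdims]
    batchsize = prod(size(A)[unitdims+1:end])
    return reshape(A, prod(unitsize), batchsize)
end

flatten_batches(randn(0, 4), 1)   # 0×4 Matrix{Float64}, no DivideError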

log_pseudolikelihood for GaussianRBM

Has the GaussianRBM been tested with MNIST data in the same way as the MNIST example for the BinaryRBM?

When I created a notebook to run a test in exactly the same fashion as the example code, it blows up on

# println("log(PL) = ", mean(@time log_pseudolikelihood(rbm, train_x)))

and indeed, in pseudolikelihood.jl there is no method of substitution_matrix_sites() that accepts rbm::RBM{<:Gaussian}.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

BinaryRBM gets very noisy fantasy particles

I ran through the MNIST BinaryRBM example with no modifications.

The output fantasy particles are noisy and lack variability.

Then I checked the output in the MNIST example and saw that the same effects are there too: noisy, with no variability.

[Figure: fantasy particles from the PCD-trained BinaryRBM]

Next, I tried fast_pcd, which is supposedly an improved sampling method, but it's even worse:

[Figure: fantasy particles from fast_pcd]

Is there something that can be done to achieve better results? This is really not very useful as is.

Spin sampling

Here are two ways of sampling ±1 spins with P(s = +1) = σ(2θ).

using LogExpFunctions, Random

function _sample_1(θ)
    # draw u ~ Uniform(0,1) and return +1 when u ≤ σ(2θ), i.e. when u * (1 + exp(-2θ)) ≤ 1
    u = rand(eltype(θ), size(θ))
    return ifelse.(u .* (1 .+ exp.(-2θ)) .≤ 1, Int8(1), Int8(-1))
end

function _sample_2(θ)
    u = randexp(eltype(θ), size(θ))
    return ifelse.(2θ .≥ u .- log1mexp.(u), Int8(1), Int8(-1))
end

and their benchmarks

[Figure: benchmark comparison of _sample_1 and _sample_2]

So implement the first.

BitArrays are slow

Converting a BitArray to Array{Float32} is roughly 15× slower than converting a plain Array{Bool} of the same size:

julia> @benchmark convert(Array{Float32}, b) setup=(b=Array(bitrand(28,28,128));)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  30.867 μs … 591.739 μs  ┊ GC (min … max): 0.00% … 74.02%
 Time  (median):     35.097 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.258 μs ±  31.908 μs  ┊ GC (mean ± σ):  5.89% ±  6.67%

        ▄██▇▅▄▅▄▂▁                                              
  ▂▁▂▃▃▇██████████▇▅▄▃▃▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  30.9 μs         Histogram: frequency by time         53.9 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
julia> @benchmark convert(Array{Float32}, b) setup=(b=bitrand(28,28,128);)
BenchmarkTools.Trial: 8586 samples with 1 evaluation.
 Range (min … max):  515.307 μs … 1.291 ms  ┊ GC (min … max): 0.00% … 40.74%
 Time  (median):     556.879 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   576.900 μs ± 66.623 μs  ┊ GC (mean ± σ):  0.63% ±  3.89%

  ▅▄ ▆█▄▂▆▅▃▁▅▃▂ ▄▃▂ ▂▄▂▁ ▁▄▂▂▁                                ▂
  ██▇███████████▇███▇████▇█████▇█▇▇▇▇▇▆▆▆▇▅▅▅▆▇▇▄▄▄▆▆▆▅▅▃▅▄▄▄▃ █
  515 μs        Histogram: log(frequency) by time       850 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
julia> @benchmark Float32.(b) setup=(b=bitrand(28,28,128);)
BenchmarkTools.Trial: 6760 samples with 1 evaluation.
 Range (min … max):  671.621 μs … 1.171 ms  ┊ GC (min … max): 0.00% … 25.08%
 Time  (median):     729.683 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   732.210 μs ± 38.635 μs  ┊ GC (mean ± σ):  0.35% ±  2.67%

       ▁▆▄▃▁▁ ▆█▆▅▃   ▂▁                                       ▁
  ▆▄▃▁▃██████▇██████▇▇███▇▆▄▄▅▆▆▄▃▁▄▁▁▁▁▁▁▁▁▃▇▇█▆▅▅▃▄▅▄▆▄▅▄▄▁▄ █
  672 μs        Histogram: log(frequency) by time       933 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
julia> @benchmark Float32.(b) setup=(b=Array(bitrand(28,28,128));)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  104.868 μs … 715.391 μs  ┊ GC (min … max): 0.00% … 62.31%
 Time  (median):     109.923 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   116.574 μs ±  37.904 μs  ┊ GC (mean ± σ):  2.04% ±  5.74%

  ▄█▅▅▂▂▁▁ ▃▁                                                   ▁
  ████████████▇▆▆▆▆▆▅▆▄▆▆▆▅▆▆▅▅▅▅▁▅▅▄▄▅▁▁▃▄▄▁▁▁▁▁▁▁▁▁▄▁▁▅▆▅▆▅▅▅ █
  105 μs        Histogram: log(frequency) by time        267 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
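
The takeaway from the numbers above, as a usage sketch rather than a package change: materialize the BitArray into a plain Array{Bool} first, then convert to floats.

using Random: bitrand

b = bitrand(28, 28, 128)                  # BitArray
x = convert(Array{Float32}, Array(b))     # fast path: Bool array first, then Float32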

GaussianBinaryRBM gets very similar fantasy particles

I created my own GaussianBinaryRBM:

function GaussianBinaryRBM(sz::Union{Int,Dims}, szb::Union{Int,Dims})
    b = zeros(Float64, szb...)
    w = zeros(Float64, sz..., szb...)  # weights connecting visible (sz) and hidden (szb) units
    return RBM(Gaussian(Float64, sz), Binary(; θ = b), w)
end

and fed real-valued MNIST data into it.

After training and generating fantasy particles with PCD, I get synthesized characters back, but in addition to being noisy, there is no variability in the output images.

[Figure: fantasy particles from the Gaussian-Binary RBM]

Is there a simple reason why this happens?

example code for testing out GaussianRBM

I want to try GaussianRBM, but there isn't any example code. The GaussianRBM constructor starts with the mean and std of the input layer. As far as I know, these are parameters learned during model training, so how can they be supplied when the model is instantiated? Unless you mean some random initial value.

If so, what random value should I enter?

CUDA issue

using CUDA
using Random: bitrand
import RestrictedBoltzmannMachines as RBMs

# N = number of visible units, M = number of hidden units, B = batch size
rbmgpu = RBMs.BinaryRBM(CUDA.randn(N), CUDA.randn(M), CUDA.randn(N,M))
vgpu = CUDA.CuArray{Float32}(bitrand(N,B))

gives the following error:

julia> RBMs.sample_v_from_v(rbmgpu, vgpu)
ERROR: GPU compilation of kernel broadcast_kernel(CUDA.CuKernelContext, CuDeviceMatrix{Bool, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, which is not isbits:
  .args is of type Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}} which is not isbits.
    .2 is of type Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}} which is not isbits.
      .x is of type Matrix{Float32} which is not isbits.


Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/validation.jl:71
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:385 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/8mHel/src/TimerOutput.jl:252 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:384 [inlined]
  [5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/utils.jl:64
  [6] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
    @ CUDA ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:332
  [7] #260
    @ ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:325 [inlined]
  [8] JuliaContext(f::CUDA.var"#260#261"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{GPUArrays.var"#broadcast_kernel#17", Tuple{CUDA.CuKernelContext, CuDeviceMatrix{Bool, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64}}}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:74
  [9] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:324
 [10] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/cache.jl:90
 [11] cufunction(f::GPUArrays.var"#broadcast_kernel#17", tt::Type{Tuple{CUDA.CuKernelContext, CuDeviceMatrix{Bool, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:297
 [12] cufunction
    @ ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:291 [inlined]
 [13] macro expansion
    @ ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:102 [inlined]
 [14] #launch_heuristic#284
    @ ~/.julia/packages/CUDA/Uurn4/src/gpuarrays.jl:17 [inlined]
 [15] _copyto!
    @ ~/.julia/packages/GPUArrays/Zecv7/src/host/broadcast.jl:73 [inlined]
 [16] copyto!
    @ ~/.julia/packages/GPUArrays/Zecv7/src/host/broadcast.jl:56 [inlined]
 [17] copy
    @ ~/.julia/packages/GPUArrays/Zecv7/src/host/broadcast.jl:47 [inlined]
 [18] materialize
    @ ./broadcast.jl:860 [inlined]
 [19] transfer_sample(layer::RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, inputs::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ RestrictedBoltzmannMachines ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/layers/binary.jl:30
 [20] sample_h_from_v
    @ ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:102 [inlined]
 [21] sample_v_from_v_once
    @ ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:144 [inlined]
 [22] sample_v_from_v(rbm::RestrictedBoltzmannMachines.RBM{RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, v::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}; steps::Int64)
    @ RestrictedBoltzmannMachines ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:124
 [23] sample_v_from_v(rbm::RestrictedBoltzmannMachines.RBM{RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, v::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ RestrictedBoltzmannMachines ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:122
 [24] top-level scope
    @ REPL[76]:1
 [25] top-level scope
    @ ~/.julia/packages/CUDA/Uurn4/src/initialization.jl:52

Something about this line:

return binary_rand.(θ, u)

that CUDA doesn't like??
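
The kernel signature in the trace shows a plain Matrix{Float32} being broadcast together with CuArrays, which suggests the uniform noise u is allocated on the CPU. A hedged sketch of a fix (not the package's actual code; binary_rand is the function named in the trace) would allocate the noise on the same device as the inputs:

using Random: rand!

# Draw the noise into an array `similar` to the inputs, so CuArray inputs give
# CuArray noise and the broadcast stays entirely on the GPU.
function transfer_sample_sketch(θ::AbstractArray)
    u = rand!(similar(θ))
    return RBMs.binary_rand.(θ, u)
end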

CUDA doesn't support falses

This line:

vm = sample_from_inputs(rbm.visible, falses(size(rbm.visible)..., batchsize)),

gives errors with CUDA, because BitArrays are not supported on the GPU (the falses call is the culprit). Here is an example:

julia> cu(randn(5,5)) .+ falses(5,5)
ERROR: GPU compilation of kernel #broadcast_kernel#17(CUDA.CuKernelContext, CuDeviceMatrix{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, which is not isbits:
  .args is of type Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}}} which is not isbits.
    .2 is of type Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}} which is not isbits.
      .x is of type BitMatrix which is not isbits.
        .chunks is of type Vector{UInt64} which is not isbits.
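
A possible workaround, sketched under the assumption that the RBM's weights live in a field rbm.w (as the constructor calls above suggest): build the zero initialization with an array type that matches the weights, so it ends up on the GPU whenever the weights do.

# Zero visible configurations with the same array type (and device) as the weights
v0 = fill!(similar(rbm.w, Bool, size(rbm.visible)..., batchsize), false)
vm = sample_from_inputs(rbm.visible, v0)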

activations conversions

hflat = flatten(hidden(rbm), h)
vflat = activations_convert_maybe(hflat, flatten(visible(rbm), v))

For binary units, this converts to Bool/Int and will not hit BLAS in the product vflat * hflat'. That is not good, because the product reduces over batches and would be much faster with BLAS.

In general, activations_convert should probably always convert to the weights' eltype, and to support CUDA it should perhaps also use the array type of the weights.
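
A minimal sketch of what that could look like (activations_convert here is a stand-in for the package's activations_convert_maybe, and the exact promotion rule is an assumption):

# Promote the flattened activations to the weights' eltype so that
# vflat * hflat' dispatches to BLAS; a CUDA-aware version would also
# match the weights' array type.
activations_convert(w::AbstractArray, x::AbstractArray) =
    convert(AbstractArray{eltype(w)}, x)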

dReLU energy at zero

If the unit has zero activation, the dReLU energy should give zero.

Currently, since the dReLU energy is implemented by calling the ReLU energy, it gives Inf.

This should be corrected.

Edit: Actually this works fine, because the ReLU energy at zero is zero. The ReLU energy gives Inf only if x < 0.
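
A minimal sketch of the convention described in the edit (the parameter names θp, θn, γp, γn and the exact energy formula are assumptions, not the package's definitions):

# ReLU energy: infinite for x < 0, and exactly zero at x = 0
relu_energy(θ, γ, x) = x < 0 ? Inf : γ * x^2 / 2 - θ * x

# dReLU energy as a sum of two ReLU energies on the positive and negative parts
drelu_energy(θp, θn, γp, γn, x) =
    relu_energy(θp, γp, max(x, 0)) + relu_energy(θn, γn, max(-x, 0))

drelu_energy(1.0, -1.0, 1.0, 1.0, 0.0)   # 0.0, not Inf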
