restrictedboltzmannmachines.jl's Introduction

RestrictedBoltzmannMachines Julia package

Train and sample Restricted Boltzmann machines in Julia.

Installation

This package is registered. Install with:

import Pkg
Pkg.add("RestrictedBoltzmannMachines")

This package does not export any symbols. Since the name RestrictedBoltzmannMachines is long, it can be imported as:

import RestrictedBoltzmannMachines as RBMs
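
A minimal usage sketch, assembled from calls that appear in the issues further below (a BinaryRBM constructed from visible bias, hidden bias and weights, and sample_v_from_v with a steps keyword) and using the RBMs alias above; the exact API may differ between releases, so treat it as a sketch rather than documentation:

# Binary-Binary RBM with 28×28 = 784 visible units and 128 hidden units
# (constructor arguments: visible bias, hidden bias, weights)
rbm = RBMs.BinaryRBM(zeros(784), zeros(128), zeros(784, 128))

# Gibbs sampling: start from random visible configurations and update them
v0 = float.(rand(Bool, 784, 64))               # 64 random configurations
v = RBMs.sample_v_from_v(rbm, v0; steps = 10)  # 10 Gibbs sweeps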

Related packages

Use RBMs on the GPU (CUDA):

Centered and standardized RBMs:

Adversarially constrained RBMs:

Save RBMs to HDF5 files:

Stacked tempering:

Citation

If you use this package in a publication, please cite:

  • Jorge Fernandez-de-Cossio-Diaz, Simona Cocco, and Remi Monasson. "Disentangling representations in Restricted Boltzmann Machines without adversaries." Physical Review X 13, 021003 (2023).

Or you can use the included CITATION.bib.

restrictedboltzmannmachines.jl's People

Contributors

cossio, dependabot[bot]

restrictedboltzmannmachines.jl's Issues

pReLU eta must be between -1 and 1

We can enforce this constraint by parameterizing eta, for example:

eta = tanh(xi)

where now xi is unbounded.

We can use something faster than tanh. Here are some examples:

[Figure: timing comparison of candidate functions for mapping an unbounded xi into (-1, 1)]

This suggests that the faster alternative is:

eta = xi / (1 + abs(xi))
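
As a sketch (the squash_* / unsquash_* names are made up here for illustration), both maps send an unbounded ξ into (-1, 1) and are easy to invert when initializing ξ from a desired η:

# Two bounded reparameterizations of eta; both map ℝ → (-1, 1)
squash_tanh(ξ) = tanh(ξ)
squash_abs(ξ) = ξ / (1 + abs(ξ))    # "softsign", cheaper than tanh

# Inverses, useful to initialize ξ from a given η with |η| < 1
unsquash_tanh(η) = atanh(η)
unsquash_abs(η) = η / (1 - abs(η))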

Zero hidden units

Many ops fail with zero hidden units, even though they are well-defined mathematically.

julia> reshape(randn(0,4),0,:)
ERROR: DivideError: integer division error
Stacktrace:
 [1] div
   @ ./int.jl:284 [inlined]
 [2] divrem
   @ ./div.jl:162 [inlined]
 [3] divrem
   @ ./div.jl:158 [inlined]
 [4] _reshape_uncolon
   @ ./reshapedarray.jl:127 [inlined]
 [5] reshape(parent::Matrix{Float64}, dims::Tuple{Int64, Colon})
   @ Base ./reshapedarray.jl:118
 [6] reshape(::Matrix{Float64}, ::Int64, ::Colon)
   @ Base ./reshapedarray.jl:117
 [7] top-level scope
   @ REPL[1]:1
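
A possible workaround, purely as a sketch (flatten_batches is a hypothetical helper, not part of the package): compute the batch dimension explicitly instead of relying on Colon, so the zero-units case never reaches the ambiguous division.

# Reshape an array of unit values into (units, batches) without using Colon,
# which fails when the number of units is zero.
function flatten_batches(A::AbstractArray, unitdims::Int)
    unitsize = size(A)[1:unitdims]
    batchsize = prod(size(A)[unitdims+1:end])
    return reshape(A, prod(unitsize), batchsize)
end

flatten_batches(randn(0, 4), 1)   # 0×4 Matrix{Float64}, no DivideError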

log_pseudolikelihood for GaussianRBM

Has the GaussianRBM been tested with MNIST data in the same way as the MNIST example for the BinaryRBM?

When I created a notebook to run a test in exactly the same fashion as the example code, it blows up on

# println("log(PL) = ", mean(@time log_pseudolikelihood(rbm, train_x)))

and indeed, in pseudolikelihood.jl there is no method of substitution_matrix_sites() that accepts rbm::RBM{<:Gaussian}.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

BinaryRBM gets very noisy fantasy particles

I ran through the MNIST BinaryRBM example with no modifications.

The output fantasy particles are noisy and lack variability.

Then I checked the output in the MNIST example and saw that the same effects are there too: noisy, with no variability.

[Figure: fantasy particles from the PCD-trained BinaryRBM]

Next, I tried fast_pcd, which is supposedly an improved sampling method, but it's even worse:

[Figure: fantasy particles from fast_pcd]

Is there something that can be done to achieve better results? This is really not very useful as is.

Spin sampling

Here are two ways of sampling ±1 spins with P(s = +1) = σ(2θ).

using LogExpFunctions, Random

function _sample_1(θ)
    # draw u ~ Uniform(0,1) and return +1 when u ≤ σ(2θ), i.e. when u * (1 + exp(-2θ)) ≤ 1
    u = rand(eltype(θ), size(θ))
    return ifelse.(u .* (1 .+ exp.(-2θ)) .≤ 1, Int8(1), Int8(-1))
end

function _sample_2(θ)
    u = randexp(eltype(θ), size(θ))
    return ifelse.(2θ .≥ u .- log1mexp.(u), Int8(1), Int8(-1))
end

and their benchmarks

[Figure: benchmark comparison of _sample_1 and _sample_2]

So implement the first.

BitArrays are slow

Converting a BitArray to Array{Float32} is roughly 15× slower than converting a plain Array{Bool} of the same size:

julia> @benchmark convert(Array{Float32}, b) setup=(b=Array(bitrand(28,28,128));)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  30.867 μs … 591.739 μs  ┊ GC (min … max): 0.00% … 74.02%
 Time  (median):     35.097 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.258 μs ±  31.908 μs  ┊ GC (mean ± σ):  5.89% ±  6.67%

        ▄██▇▅▄▅▄▂▁                                              
  ▂▁▂▃▃▇██████████▇▅▄▃▃▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  30.9 μs         Histogram: frequency by time         53.9 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
julia> @benchmark convert(Array{Float32}, b) setup=(b=bitrand(28,28,128);)
BenchmarkTools.Trial: 8586 samples with 1 evaluation.
 Range (min … max):  515.307 μs … 1.291 ms  ┊ GC (min … max): 0.00% … 40.74%
 Time  (median):     556.879 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   576.900 μs ± 66.623 μs  ┊ GC (mean ± σ):  0.63% ±  3.89%

  ▅▄ ▆█▄▂▆▅▃▁▅▃▂ ▄▃▂ ▂▄▂▁ ▁▄▂▂▁                                ▂
  ██▇███████████▇███▇████▇█████▇█▇▇▇▇▇▆▆▆▇▅▅▅▆▇▇▄▄▄▆▆▆▅▅▃▅▄▄▄▃ █
  515 μs        Histogram: log(frequency) by time       850 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
julia> @benchmark Float32.(b) setup=(b=bitrand(28,28,128);)
BenchmarkTools.Trial: 6760 samples with 1 evaluation.
 Range (min … max):  671.621 μs … 1.171 ms  ┊ GC (min … max): 0.00% … 25.08%
 Time  (median):     729.683 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   732.210 μs ± 38.635 μs  ┊ GC (mean ± σ):  0.35% ±  2.67%

       ▁▆▄▃▁▁ ▆█▆▅▃   ▂▁                                       ▁
  ▆▄▃▁▃██████▇██████▇▇███▇▆▄▄▅▆▆▄▃▁▄▁▁▁▁▁▁▁▁▃▇▇█▆▅▅▃▄▅▄▆▄▅▄▄▁▄ █
  672 μs        Histogram: log(frequency) by time       933 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
julia> @benchmark Float32.(b) setup=(b=Array(bitrand(28,28,128));)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  104.868 μs … 715.391 μs  ┊ GC (min … max): 0.00% … 62.31%
 Time  (median):     109.923 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   116.574 μs ±  37.904 μs  ┊ GC (mean ± σ):  2.04% ±  5.74%

  ▄█▅▅▂▂▁▁ ▃▁                                                   ▁
  ████████████▇▆▆▆▆▆▅▆▄▆▆▆▅▆▆▅▅▅▅▁▅▅▄▄▅▁▁▃▄▄▁▁▁▁▁▁▁▁▁▄▁▁▅▆▅▆▅▅▅ █
  105 μs        Histogram: log(frequency) by time        267 μs <

 Memory estimate: 392.06 KiB, allocs estimate: 2.
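
The takeaway from the numbers above, as a usage sketch rather than a package change: materialize the BitArray into a plain Array{Bool} first, then convert to floats.

using Random: bitrand

b = bitrand(28, 28, 128)                  # BitArray
x = convert(Array{Float32}, Array(b))     # fast path: Bool array first, then Float32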

GaussianBinaryRBM gets very similar fantasy particles

I created my own GaussianBinaryRBM:

function GaussianBinaryRBM(sz::Union{Int,Dims}, szb::Union{Int,Dims})
    b = zeros(Float64, szb...)
    w = zeros(Float64, sz..., szb...)  # weights connecting visible (sz) and hidden (szb) units
    return RBM(Gaussian(Float64, sz), Binary(; θ = b), w)
end

and fed real-valued MNIST data into it.

After training and generating fantasy particles with PCD, I get synthesized characters back, but in addition to being noisy, there is no variability in the output images.

[Figure: fantasy particles from the Gaussian-Binary RBM]

Is there a simple reason why this happens?

example code for testing out GaussianRBM

I want to try GaussianRBM, but there isn't any example code. The GaussianRBM constructor starts with the mean and std of the input layer. As far as I know, these are parameters learned during model training, so how can they be supplied when the model is instantiated? Unless you mean some random initial value.

If so, what random value should I enter?

CUDA issue

using CUDA
using Random: bitrand
import RestrictedBoltzmannMachines as RBMs

# N = number of visible units, M = number of hidden units, B = batch size
rbmgpu = RBMs.BinaryRBM(CUDA.randn(N), CUDA.randn(M), CUDA.randn(N,M))
vgpu = CUDA.CuArray{Float32}(bitrand(N,B))

gives the following error:

julia> RBMs.sample_v_from_v(rbmgpu, vgpu)
ERROR: GPU compilation of kernel broadcast_kernel(CUDA.CuKernelContext, CuDeviceMatrix{Bool, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, which is not isbits:
  .args is of type Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}} which is not isbits.
    .2 is of type Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}} which is not isbits.
      .x is of type Matrix{Float32} which is not isbits.


Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/validation.jl:71
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:385 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/8mHel/src/TimerOutput.jl:252 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:384 [inlined]
  [5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/utils.jl:64
  [6] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
    @ CUDA ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:332
  [7] #260
    @ ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:325 [inlined]
  [8] JuliaContext(f::CUDA.var"#260#261"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{GPUArrays.var"#broadcast_kernel#17", Tuple{CUDA.CuKernelContext, CuDeviceMatrix{Bool, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64}}}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:74
  [9] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:324
 [10] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/cache.jl:90
 [11] cufunction(f::GPUArrays.var"#broadcast_kernel#17", tt::Type{Tuple{CUDA.CuKernelContext, CuDeviceMatrix{Bool, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(RestrictedBoltzmannMachines.binary_rand), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:297
 [12] cufunction
    @ ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:291 [inlined]
 [13] macro expansion
    @ ~/.julia/packages/CUDA/Uurn4/src/compiler/execution.jl:102 [inlined]
 [14] #launch_heuristic#284
    @ ~/.julia/packages/CUDA/Uurn4/src/gpuarrays.jl:17 [inlined]
 [15] _copyto!
    @ ~/.julia/packages/GPUArrays/Zecv7/src/host/broadcast.jl:73 [inlined]
 [16] copyto!
    @ ~/.julia/packages/GPUArrays/Zecv7/src/host/broadcast.jl:56 [inlined]
 [17] copy
    @ ~/.julia/packages/GPUArrays/Zecv7/src/host/broadcast.jl:47 [inlined]
 [18] materialize
    @ ./broadcast.jl:860 [inlined]
 [19] transfer_sample(layer::RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, inputs::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ RestrictedBoltzmannMachines ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/layers/binary.jl:30
 [20] sample_h_from_v
    @ ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:102 [inlined]
 [21] sample_v_from_v_once
    @ ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:144 [inlined]
 [22] sample_v_from_v(rbm::RestrictedBoltzmannMachines.RBM{RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, v::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}; steps::Int64)
    @ RestrictedBoltzmannMachines ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:124
 [23] sample_v_from_v(rbm::RestrictedBoltzmannMachines.RBM{RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, RestrictedBoltzmannMachines.Binary{1, Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, v::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ RestrictedBoltzmannMachines ~/.julia/packages/RestrictedBoltzmannMachines/9myqX/src/rbm.jl:122
 [24] top-level scope
    @ REPL[76]:1
 [25] top-level scope
    @ ~/.julia/packages/CUDA/Uurn4/src/initialization.jl:52

Something about this line:

return binary_rand.(θ, u)

that CUDA doesn't like??
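
The kernel signature in the trace shows a plain Matrix{Float32} being broadcast together with CuArrays, which suggests the uniform noise u is allocated on the CPU. A hedged sketch of a fix (not the package's actual code; binary_rand is the function named in the trace) would allocate the noise on the same device as the inputs:

using Random: rand!

# Draw the noise into an array `similar` to the inputs, so CuArray inputs give
# CuArray noise and the broadcast stays entirely on the GPU.
function transfer_sample_sketch(θ::AbstractArray)
    u = rand!(similar(θ))
    return RBMs.binary_rand.(θ, u)
end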

CUDA doesn't support falses

This line:

vm = sample_from_inputs(rbm.visible, falses(size(rbm.visible)..., batchsize)),

gives errors with CUDA, because BitArrays are not supported on the GPU (the falses call is the culprit). Here is an example:

julia> cu(randn(5,5)) .+ falses(5,5)
ERROR: GPU compilation of kernel #broadcast_kernel#17(CUDA.CuKernelContext, CuDeviceMatrix{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Int64) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, which is not isbits:
  .args is of type Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}}} which is not isbits.
    .2 is of type Base.Broadcast.Extruded{BitMatrix, Tuple{Bool, Bool}, Tuple{Int64, Int64}} which is not isbits.
      .x is of type BitMatrix which is not isbits.
        .chunks is of type Vector{UInt64} which is not isbits.
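
A possible workaround, sketched under the assumption that the RBM's weights live in a field rbm.w (as the constructor calls above suggest): build the zero initialization with an array type that matches the weights, so it ends up on the GPU whenever the weights do.

# Zero visible configurations with the same array type (and device) as the weights
v0 = fill!(similar(rbm.w, Bool, size(rbm.visible)..., batchsize), false)
vm = sample_from_inputs(rbm.visible, v0)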

activations conversions

hflat = flatten(hidden(rbm), h)
vflat = activations_convert_maybe(hflat, flatten(visible(rbm), v))

For binary units, this converts to Bool/Int and will not hit BLAS in the product vflat * hflat'. That is not good, because the product reduces over batches and would be much faster with BLAS.

In general, activations_convert should probably always convert to the weights' eltype, and to support CUDA it should perhaps also use the array type of the weights.
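
A minimal sketch of what that could look like (activations_convert here is a stand-in for the package's activations_convert_maybe, and the exact promotion rule is an assumption):

# Promote the flattened activations to the weights' eltype so that
# vflat * hflat' dispatches to BLAS; a CUDA-aware version would also
# match the weights' array type.
activations_convert(w::AbstractArray, x::AbstractArray) =
    convert(AbstractArray{eltype(w)}, x)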

dReLU energy at zero

If the unit has zero activation, the dReLU energy should give zero.

Currently, since the dReLU energy is implemented by calling the ReLU energy, it gives Inf.

This should be corrected.

Edit: Actually this works fine, because the ReLU energy at zero is zero. The ReLU energy gives Inf only if x < 0.
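
A minimal sketch of the convention described in the edit (the parameter names θp, θn, γp, γn and the exact energy formula are assumptions, not the package's definitions):

# ReLU energy: infinite for x < 0, and exactly zero at x = 0
relu_energy(θ, γ, x) = x < 0 ? Inf : γ * x^2 / 2 - θ * x

# dReLU energy as a sum of two ReLU energies on the positive and negative parts
drelu_energy(θp, θn, γp, γn, x) =
    relu_energy(θp, γp, max(x, 0)) + relu_energy(θn, γn, max(-x, 0))

drelu_energy(1.0, -1.0, 1.0, 1.0, 0.0)   # 0.0, not Inf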
