kann's Issues

Adding CUDA support?

Hi,

Is there any plan to add CUDA support in the near future? It would be very useful for training medium-sized networks, and very attractive for platforms like the Tegra TK1. Libraries like Caffe and MXNet depend on many other libraries, and resolving those dependency conflicts during installation can take a lot of time.

Custom loss function example using kad_op functions

Hi,

I'm trying to implement a custom loss function with a simple MLP.
Is there an example of using the kad_op functions to accomplish this so that I benefit from automatic differentiation?
I don't want to explicitly write the backward computation, as is done for the currently implemented loss functions (MSE, CE, etc.).

Or is this approach not feasible (for memory-consumption reasons), since it would require computing and storing the gradients for each operation in the loss function?

I'd greatly appreciate any help/feedback/example!

Thanks!
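A minimal sketch of what this could look like, built only from primitive kad_* operators so the gradients come from automatic differentiation. It mirrors how kann_layer_cost() wires up the truth and cost nodes in kann.c; the kad_reduce_mean() axis convention is an assumption here and should be checked against kautodiff.h.

#include "kann.h"

/* Sketch: hand-rolled MSE from primitive ops instead of kad_mse(). */
kann_t *mlp_with_custom_cost(int n_in, int n_out)
{
	kad_node_t *t, *truth, *cost;

	t = kann_layer_input(n_in);
	t = kad_relu(kann_layer_dense(t, 64));
	t = kann_layer_dense(t, n_out);
	t->ext_flag |= KANN_F_OUT;               /* mark the prediction node */

	truth = kad_feed(2, 1, n_out);           /* labels fed at training time */
	truth->ext_flag |= KANN_F_TRUTH;

	/* loss = mean((pred - truth)^2); the two reductions collapse the
	 * (mini-batch, n_out) tensor to a scalar -- verify the axis semantics
	 * against your copy of kautodiff.h */
	cost = kad_square(kad_sub(t, truth));
	cost = kad_reduce_mean(kad_reduce_mean(cost, 1), 0);
	cost->ext_flag |= KANN_F_COST;

	return kann_new(cost, 0);
}

Training would then go through kann_train_fnn1() as usual, since that routine only needs the KANN_F_TRUTH and KANN_F_COST nodes to be present in the graph.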

What does "va_start" mean in kann ?

In kann.c, line 521:
va_start(ap, n_d); for (i = 0; i < n_d; ++i) d[i] = va_arg(ap, int); va_end(ap)
My question is: what is this used for?
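va_start, va_arg and va_end are the standard C variadic-argument macros from <stdarg.h>: kad_feed() and similar constructors accept a variable number of dimension sizes, and that line copies them into the local array d[]. A standalone illustration of the same pattern:

#include <stdarg.h>
#include <stdio.h>

/* Illustration only: collect a variable number of dimension sizes,
 * the same way kautodiff.c does. */
static void print_dims(int n_d, ...)
{
	int i, d[8];
	va_list ap;
	va_start(ap, n_d);              /* start reading arguments after n_d */
	for (i = 0; i < n_d; ++i)
		d[i] = va_arg(ap, int);     /* pull out each dimension size */
	va_end(ap);
	for (i = 0; i < n_d; ++i)
		printf("dim %d = %d\n", i, d[i]);
}

int main(void)
{
	print_dims(3, 1, 1, 28);        /* like kad_feed(3, 1, 1, 28) */
	return 0;
}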

kann_layer_linear

I have not been able to find the kann_layer_linear definition in the source code.

It is from the example (A complete example) given on the page.

t = kad_relu(kann_layer_linear(t, 64));

It's not that I need it right now, but it is the first thing I tried running.

Model inference on ARM M4F

Hi there.
Great work on the project! I have made successful progress on macOS. But I was wondering whether the trained model, say mnist-cnn.kan, could be transferred to an ARM M4F chip for inference? There would definitely be input and output data processing, but apart from that, is it possible to use the trained model on ARM? Thanks in advance!
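For reference, inference with a saved model only needs kann_load() and kann_apply1(), which are plain C99 plus libm; a minimal sketch (on a bare-metal M4F the file-based loading would likely have to be replaced with weights baked into flash, and the 28x28/10-class sizes below are just this example's assumptions):

#include "kann.h"

/* Sketch: single-sample inference with a saved model. */
int predict_digit(const char *model_fn, float *pixels /* 28*28 */, float *out10)
{
	int j, n_out;
	const float *y;
	kann_t *ann = kann_load(model_fn);
	if (ann == 0) return -1;
	n_out = kann_dim_out(ann);
	y = kann_apply1(ann, pixels);            /* forward pass only */
	for (j = 0; j < n_out && j < 10; ++j) out10[j] = y[j];
	kann_delete(ann);
	return n_out;
}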

Image classification of cats, dogs, and pandas

I am attempting to use KANN to classify cats, dogs, and pandas. I have pre-processed the data so that every image is read in as RGB bytes, scaled to float, and resized to 32x32 (still 3 channels). I store the images and labels as float **x, float **y, where x has dimensions [nsamples][32x32x3] (a flattened array with rows = ncols * 3, laid out as RGB per pixel) and y has dimensions [nsamples][3] (for the 3 classes of cat/dog/panda). I split my data into a 75% training set and send it into a modified version of the "Complete Example" provided:

int train_kann(float **x, int nrows, int ncols, int nbands, float **y, int nclasses, int n_samples)
{
	int max_bit, i;
	kad_node_t *t;
	kann_t *ann;

	max_bit = nrows * ncols * nbands;

	// construct an MLP with one hidden layer
	t = kann_layer_input(max_bit);
	t = kad_relu(kann_layer_dense(t, 64));
	t = kann_layer_cost(t, nclasses, KANN_C_CEM); // output uses 1-hot encoding
	ann = kann_new(t, 0);

	// train
	kann_train_fnn1(ann, 0.001f, 64, 50, 10, 0.1f, n_samples, x, y);

	return 0;
}

However, I am getting some strange output from kann_train_fnn1: it is not reporting the class error for training or validation, so I am getting n_train_base == 0 and n_val == 0 (meaning no class error is computed?).

epoch: 1; training cost: 13.2655; validation cost: 13.8155
epoch: 2; training cost: 13.8112; validation cost: 13.8155
epoch: 3; training cost: 13.8155; validation cost: 13.8155
epoch: 4; training cost: 13.8155; validation cost: 13.8155
(repeats these values for the remaining epochs)

I have a feeling this is an issue of how I set up my data and labels. Any help would be greatly appreciated.
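One thing worth double-checking in a setup like this is that each y row is a true one-hot vector for KANN_C_CEM and that the pixel values are scaled into [0, 1]. A hypothetical helper for preparing one sample (raw, label and the size arguments are placeholders for this data pipeline, not part of kann):

/* Hypothetical helper: builds one sample's feature row and one-hot label row. */
static void fill_sample(const unsigned char *raw, int n_values, int label, int nclasses,
                        float *x_row, float *y_row)
{
	int k;
	for (k = 0; k < n_values; ++k)
		x_row[k] = raw[k] / 255.0f;      /* scale RGB bytes to [0,1] */
	for (k = 0; k < nclasses; ++k)
		y_row[k] = 0.0f;
	y_row[label] = 1.0f;                 /* one-hot target for KANN_C_CEM */
}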

Can we define the neuron number for each layer?

Hi,

From the MLP example code, I can see that I can set the number of hidden layers, but the number of neurons is the same for every layer. Is there any way to define a different number of neurons for different layers?

Cheers,
Travis
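For reference, the layer width is simply the second argument of each kann_layer_dense() call, so different sizes per layer come from chaining the calls explicitly instead of looping with a fixed n_h_neurons; a small sketch:

#include "kann.h"

/* Sketch: an MLP with a different neuron count in each hidden layer. */
kann_t *mlp_tapered(int n_in, int n_out)
{
	kad_node_t *t = kann_layer_input(n_in);
	t = kad_relu(kann_layer_dense(t, 128));  /* hidden layer 1: 128 neurons */
	t = kad_relu(kann_layer_dense(t, 64));   /* hidden layer 2: 64 neurons */
	t = kad_relu(kann_layer_dense(t, 32));   /* hidden layer 3: 32 neurons */
	return kann_new(kann_layer_cost(t, n_out, KANN_C_CEM), 0);
}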

Convolutional recurrent neural network

I want to combine a convolutional layer with a recurrent one. This code is based on #19:

    kad_node_t *t;
    int rnn_flag = KANN_RNN_VAR_H0;
    if (norm) rnn_flag |= KANN_RNN_NORM;
    t = kad_feed(3, 1, 1, 28), t->ext_flag |= KANN_F_IN;
    t = kad_relu(kann_layer_conv1d(t, 32, 3, 1, 0)); // 3 kernel; 1 stride; 0 padding
    t = kann_layer_dropout(t, dropout);
    t = kad_max1d(t, 2, 2, 0); // 2 kernel; 2 stride; 0 padding
    for (i = 0; i < n_h_layers; ++i) {
      t = kann_layer_gru(t, n_h_neurons, rnn_flag);
      t = kann_layer_dropout(t, dropout);
    }
    t = kad_select(1, &t, -1);
    ann = kann_new(kann_layer_cost(t, 10, KANN_C_CEB), 0);
    kad_print_graph(stdout, ann->n, ann->v);

It works:

./mnist-crnn -i mnist-crnn.kan kann-data/mnist-test-x.knd | kann-data/mnist-eval.pl
Error rate: 1.19%

Questions:

  • I stumbled across the same problem as #6 at first, then I replaced kann_layer_input with kad_feed(3, 1, 1, 28) to make it work, but the numbers 1, 1 still look like magic to me... Are they correct?

  • Does backprop work correctly for conv1d on an unrolled RNN?

Whole code:

#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "kann_extra/kann_data.h"
#include "kann.h"

typedef struct {
  int n_in, n_out, ulen, n;
  float **x, **y;
} train_data;

static void train(kann_t *ann, train_data *d, float lr, int mini_size, int max_epoch, const char *fn, int n_threads)
{
  float **x, **y, *r, best_cost = 1e30f;
  int epoch, j, n_var, *shuf;
  kann_t *ua;

  n_var = kann_size_var(ann);
  r = (float*)calloc(n_var, sizeof(float));
  x = (float**)malloc(d->ulen * sizeof(float*));
  y = (float**)malloc(1 * sizeof(float*));
  for (j = 0; j < d->ulen; ++j) {
    x[j] = (float*)calloc(mini_size * d->n_in, sizeof(float));
  }
  y[0] = (float*)calloc(mini_size * d->n_out, sizeof(float));
  shuf = (int*)calloc(d->n, sizeof(int));

  ua = kann_unroll(ann, d->ulen);
  kann_set_batch_size(ua, mini_size);
  kann_mt(ua, n_threads, mini_size);
  kann_feed_bind(ua, KANN_F_IN,    0, x);
  kann_feed_bind(ua, KANN_F_TRUTH, 0, y);
  kann_switch(ua, 1);
  for (epoch = 0; epoch < max_epoch; ++epoch) {
    kann_shuffle(d->n, shuf);
    double cost = 0.0;
    int tot = 0, tot_base = 0, n_cerr = 0;
    for (j = 0; j < d->n - mini_size; j += mini_size) {
      int b, k;
      for (b = 0; b < mini_size; ++b) {
        int s = shuf[j + b];
        for (k = 0; k < d->ulen; ++k) {
          memcpy(&x[k][b * d->n_in], &d->x[s][k * d->n_in], d->n_in * sizeof(float));
        }
        memcpy(&y[0][b * d->n_out], d->y[s], d->n_out * sizeof(float));
      }
      cost += kann_cost(ua, 0, 1) * d->ulen * mini_size;
      n_cerr += kann_class_error(ua, &k);
      tot_base += k;
      //kad_check_grad(ua->n, ua->v, ua->n-1);
      kann_RMSprop(n_var, lr, 0, 0.9f, ua->g, ua->x, r);
      tot += d->ulen * mini_size;
    }
    if (cost < best_cost) {
      best_cost = cost;
      if (fn) kann_save(fn, ann);
    }
    fprintf(stderr, "epoch: %d; cost: %g (class error: %.2f%%)\n", epoch+1, cost / tot, 100.0f * n_cerr / tot_base);
  }

  kann_delete_unrolled(ua);

  for (j = 0; j < d->ulen; ++j) {
    free(x[j]);
  }
  free(y[0]); free(y); free(x); free(r); free(shuf);
}

static train_data* create_train_data(kann_t *ann, kann_data_t *x, kann_data_t *y)
{
  train_data *d;
  d = (train_data*)malloc(sizeof(*d));
  assert(d);
  assert(x->n_row == y->n_row);
  d->x = x->x;
  d->y = y->x;
  d->ulen = 28; // 28x28
  d->n = x->n_row;
  d->n_in = kann_dim_in(ann);
  d->n_out = kann_dim_out(ann);
  return d;
}

int main(int argc, char *argv[])
{
  kann_t *ann;
  kann_data_t *x, *y;
  char *fn_in = 0, *fn_out = 0;
  int c, i, mini_size = 64, max_epoch = 50, seed = 84, n_h_layers = 1, n_h_neurons = 64, norm = 1, n_h_flt = 32, n_threads = 1;
  float lr = 0.001f, dropout = 0.2f;

  while ((c = getopt(argc, argv, "i:o:m:l:n:d:s:t:N")) >= 0) {
    if (c == 'i') fn_in = optarg;
    else if (c == 'o') fn_out = optarg;
    else if (c == 'm') max_epoch = atoi(optarg);
    else if (c == 'l') n_h_layers = atoi(optarg);
    else if (c == 'n') n_h_neurons = atoi(optarg);
    else if (c == 'd') dropout = atof(optarg);
    else if (c == 's') seed = atoi(optarg);
    else if (c == 't') n_threads = atoi(optarg);
    else if (c == 'N') norm = 0;
  }

  if (argc - optind == 0 || (argc - optind == 1 && fn_in == 0)) {
    FILE *fp = stdout;
    fprintf(fp, "Usage: mnist-cnn [-i model] [-o model] [-t nThreads] <x.knd> [y.knd]\n");
    return 1;
  }

  kad_trap_fe();
  kann_srand(seed);
  if (fn_in) {
    ann = kann_load(fn_in);
  } else {
    kad_node_t *t;
    int rnn_flag = KANN_RNN_VAR_H0;
    if (norm) rnn_flag |= KANN_RNN_NORM;
    t = kad_feed(3, 1, 1, 28), t->ext_flag |= KANN_F_IN;
    t = kad_relu(kann_layer_conv1d(t, 32, 3, 1, 0)); // 3 kernel; 1 stride; 0 padding
    t = kann_layer_dropout(t, dropout);
    t = kad_max1d(t, 2, 2, 0); // 2 kernel; 2 stride; 0 padding
    for (i = 0; i < n_h_layers; ++i) {
      t = kann_layer_gru(t, n_h_neurons, rnn_flag);
      t = kann_layer_dropout(t, dropout);
    }
    t = kad_select(1, &t, -1);
    ann = kann_new(kann_layer_cost(t, 10, KANN_C_CEB), 0);
    kad_print_graph(stdout, ann->n, ann->v);
  }

  x = kann_data_read(argv[optind]);
  assert(x->n_col == 28 * 28);
  y = argc - optind >= 2? kann_data_read(argv[optind+1]) : 0;

  if (y) { // training
    assert(y->n_col == 10);
    if (n_threads > 1) kann_mt(ann, n_threads, mini_size);
    train_data *d;
    d = create_train_data(ann, x, y);
    train(ann, d, lr, mini_size, max_epoch, fn_out, n_threads);
    free(d);
    kann_data_free(y);
  } else { // applying
    int i, j, k, n_out;
    kann_switch(ann, 0);
    n_out = kann_dim_out(ann);
    assert(n_out == 10);
    for (i = 0; i < x->n_row; ++i) {
      const float *y;
      kann_rnn_start(ann);
      for(k = 0; k < 28; ++k) {
        float x1[28];
        memcpy(x1, &x->x[i][k * 28], sizeof(x1));
        y = kann_apply1(ann, x1);
      }
      if (x->rname) printf("%s\t", x->rname[i]);
      for (j = 0; j < n_out; ++j) {
        if (j) putchar('\t');
        printf("%.3g", y[j] + 1.0f - 1.0f);
      }
      putchar('\n');
      kann_rnn_end(ann);
    }
  }

  kann_data_free(x);
  kann_delete(ann);
  return 0;
}

License Type

Hello,

I am looking for a lightweight and standalone framework for deep learning, and this one looks like it could match my needs.
What license covers the source code?
MIT? BSD-3?

Thanks,
Mathieu

Format of model file

I would like to use a model that is pre-trained in Keras or TensorFlow and run it with kann.
I am trying to find the file format that the weights need to be saved in so that kann can load them.
Please advise.

resnet

I want to apply it to ResNet, so I wrote:

kad_node_t *basic_block(kad_node_t *x, int channel)
{
	kad_node_t *y = kad_relu(kann_layer_conv2d(x, channel, 3, 3, 1, 1, 1, 1));
	y = kann_layer_conv2d(y, channel, 3, 3, 1, 1, 1, 1);
	return kad_relu(kad_add(x, y));
}

but it raises a segmentation fault.

Transform script for dataset

Hi,
I was able to download the dataset you provided. Could I get your transformation script so that we can properly convert other datasets for loading into kann? (There is no such script in the kann repository.)

Thank you

Possible Conv1D and Max1D Issue

Hi there.
I am dealing with a 1D signal, so I have modified the mnist-cnn.c example and changed the model as shown below:

kad_node_t *t;
t = kann_layer_input(200);
t = kad_relu(kann_layer_conv1d(t, n_h_flt, 5, 1, 2));
t = kad_max1d(t, 2, 1, 1);
t = kad_relu(kann_layer_conv1d(t, n_h_flt, 5, 1, 2));
t = kad_max1d(t, 2, 1, 1);
t = kann_layer_dense(t, 200);
t = kad_relu(t);
t = kann_layer_dense(t, 100);
t = kad_relu(t);
t = kann_layer_dense(t, 50);
t = kad_relu(t);
ann = kann_new(kann_layer_cost(t, 2, KANN_C_CEB), 0);

I added the padding to keep the length. My input is 200 samples long and the output is a simple true/false.
However, when I compile and run training on this, the terminal immediately shows "Segmentation fault: 11". I believe the model to be correct, so I suspect the issue is in one of the 1D-related functions?

Thanks in advance!

Why isn't KANN scalable?

Hi,
I was wondering about the statement in the README - "KANN is not as scalable, but it is close in flexibility, has a much smaller code base and only depends on the standard C library.".

Why isn't KANN scalable, and why isn't it suitable for training deeper networks?

More examples (classify text)

Hi, basically I'm hoping someone can point me to (or create) an example of a convolution-based text-sentiment classifier using KANN. It doesn't need to be exact, just something I can use as a base to start from. Something like this example from Keras; even just the core model code would give me a starting point...

embedding_dim = 100
model = Sequential()
model.add(layers.Embedding(vocab_size, embedding_dim, input_length=maxlen))
model.add(layers.Conv1D(128, 5, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(10, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.summary()

(example from) https://realpython.com/python-keras-text-classification/

Here is my guess... but it's pure guess work so it may be nonsense :-)

kann_t *model_gen_classify(int n_h_flt, int n_h_fc)
{
	int wordsize = 10;           // let's assume word embeddings
	int sentence = 100;          // length of each sentence to analyze
	int wordgroup[] = {3, 4, 5}; // group words in 3-, 4-, 5-word groups in conv layers
	float dropout = 0.2f;
	kann_t *ann;
	kad_node_t *t;
	t = kad_feed(4, 1, 1, sentence, wordsize), t->ext_flag |= KANN_F_IN;
	t = kad_relu(kann_layer_conv1d(t, n_h_flt, wordgroup[0], wordsize, 1, 1, 0, 0));
	t = kad_relu(kann_layer_conv1d(t, n_h_flt, wordgroup[1], wordsize, 1, 1, 0, 0));
	t = kad_relu(kann_layer_conv1d(t, n_h_flt, wordgroup[2], wordsize, 1, 1, 0, 0));
	t = kann_layer_dropout(t, dropout);
	t = kann_layer_dense(t, n_h_fc);
	t = kad_relu(t);
	ann = kann_new(kann_layer_cost(t, 1, KANN_C_CEB), 0);
	return ann;
}

mnist-cnn example fails assert on training

I'm not sure if I'm missing something, but I tried the mnist-cnn example in the same way as the README, and training fails an assert on line 51:

assert(x->n_col == 28 * 28);

I printed x->n_col and the result is 0. I'm not sure if the problem is in the code or in the data (I got the data from this repo as well, as stated in the examples' README file).

I tried removing the assert but it naturally just segfaults elsewhere.

The mlp example works fine, so I assume it isn't the data.

RNN classification example

When classifying a sequence, we would like the network to have one output instead of a sequence of outputs. According to 01user.md, kad_avg can be used to classify a sequence. I tried this on MNIST. It works, but
I am not sure how to train such a network: during training we don't even know the output values other than the last one. In the line memcpy(&y[k][b * d->n_out], d->y[s], d->n_out * sizeof(float)); each y in the output sequence gets the same value d->y[s], which looks strange.

#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "kann_extra/kann_data.h"
#include "kann.h"

typedef struct {
  int n_in, n_out, ulen, n;
  float **x, **y;
} train_data;

static void train(kann_t *ann, train_data *d, float lr, int mini_size, int max_epoch, const char *fn, int n_threads)
{
  float **x, **y, *r, best_cost = 1e30f;
  int epoch, j, n_var, *shuf;
  kann_t *ua;

  n_var = kann_size_var(ann);
  r = (float*)calloc(n_var, sizeof(float));
  x = (float**)malloc(d->ulen * sizeof(float*));
  y = (float**)malloc(d->ulen * sizeof(float*));
  for (j = 0; j < d->ulen; ++j) {
    x[j] = (float*)calloc(mini_size * d->n_in, sizeof(float));
    y[j] = (float*)calloc(mini_size * d->n_out, sizeof(float));
  }
  shuf = (int*)calloc(d->n, sizeof(int));

  ua = kann_unroll(ann, d->ulen);
  kann_set_batch_size(ua, mini_size);
  kann_mt(ua, n_threads, mini_size);
  kann_feed_bind(ua, KANN_F_IN,    0, x);
  kann_feed_bind(ua, KANN_F_TRUTH, 0, y);
  kann_switch(ua, 1);
  for (epoch = 0; epoch < max_epoch; ++epoch) {
    kann_shuffle(d->n, shuf);
    double cost = 0.0;
    int tot = 0, tot_base = 0, n_cerr = 0;
    for (j = 0; j < d->n - mini_size; j += mini_size) {
      int b, k;
      for (k = 0; k < d->ulen; ++k) {
        for (b = 0; b < mini_size; ++b) {
          int s = shuf[j + b];
          memcpy(&x[k][b * d->n_in], &d->x[s][k * d->n_in], d->n_in * sizeof(float));
          memcpy(&y[k][b * d->n_out], d->y[s], d->n_out * sizeof(float));
        }
      }
      cost += kann_cost(ua, 0, 1) * d->ulen * mini_size;
      n_cerr += kann_class_error(ua, &k);
      tot_base += k;
      //kad_check_grad(ua->n, ua->v, ua->n-1);
      kann_RMSprop(n_var, lr, 0, 0.9f, ua->g, ua->x, r);
      tot += d->ulen * mini_size;
    }
    if (cost < best_cost) {
      best_cost = cost;
      if (fn) kann_save(fn, ann);
    }
    fprintf(stderr, "epoch: %d; cost: %g (class error: %.2f%%)\n", epoch+1, cost / tot, 100.0f * n_cerr / tot_base);
  }

  kann_delete_unrolled(ua);

  for (j = 0; j < d->ulen; ++j) {
    free(y[j]); free(x[j]);
  }
  free(y); free(x); free(r); free(shuf);
}

static train_data* create_train_data(kann_t *ann, kann_data_t *x, kann_data_t *y)
{
  train_data *d;
  d = (train_data*)malloc(sizeof(*d));
  assert(d);
  assert(x->n_row == y->n_row);
  d->x = x->x;
  d->y = y->x;
  d->ulen = 28; // 28x28
  d->n = x->n_row;
  d->n_in = kann_dim_in(ann);
  d->n_out = kann_dim_out(ann);
  return d;
}

int main(int argc, char *argv[])
{
  kann_t *ann;
  kann_data_t *x, *y;
  char *fn_in = 0, *fn_out = 0;
  int c, i, mini_size = 64, max_epoch = 50, seed = 84, n_h_layers = 1, n_h_neurons = 64, norm = 1, n_threads = 1;
  float lr = 0.001f, dropout = 0.2f;

  while ((c = getopt(argc, argv, "i:o:m:l:n:d:s:t:N")) >= 0) {
    if (c == 'i') fn_in = optarg;
    else if (c == 'o') fn_out = optarg;
    else if (c == 'm') max_epoch = atoi(optarg);
    else if (c == 'l') n_h_layers = atoi(optarg);
    else if (c == 'n') n_h_neurons = atoi(optarg);
    else if (c == 'd') dropout = atof(optarg);
    else if (c == 's') seed = atoi(optarg);
    else if (c == 't') n_threads = atoi(optarg);
    else if (c == 'N') norm = 0;
  }

  if (argc - optind == 0 || (argc - optind == 1 && fn_in == 0)) {
    FILE *fp = stdout;
    fprintf(fp, "Usage: mnist-cnn [-i model] [-o model] [-t nThreads] <x.knd> [y.knd]\n");
    return 1;
  }

  kad_trap_fe();
  kann_srand(seed);
  if (fn_in) {
    ann = kann_load(fn_in);
  } else {
    kad_node_t *t;
    int rnn_flag = KANN_RNN_VAR_H0;
    if (norm) rnn_flag |= KANN_RNN_NORM;
    t = kann_layer_input(28); // 28x28
    for (i = 0; i < n_h_layers; ++i) {
      t = kann_layer_gru(t, n_h_neurons, rnn_flag);
      t = kann_layer_dropout(t, dropout);
    }
    t = kad_avg(1, &t);
    ann = kann_new(kann_layer_cost(t, 10, KANN_C_CEB), 0);
  }

  x = kann_data_read(argv[optind]);
  assert(x->n_col == 28 * 28);
  y = argc - optind >= 2? kann_data_read(argv[optind+1]) : 0;

  if (y) { // training
    assert(y->n_col == 10);
    if (n_threads > 1) kann_mt(ann, n_threads, mini_size);
    train_data *d;
    d = create_train_data(ann, x, y);
    train(ann, d, lr, mini_size, max_epoch, fn_out, n_threads);
    free(d);
    kann_data_free(y);
  } else { // applying
    int i, j, k, n_out;
    kann_switch(ann, 0);
    n_out = kann_dim_out(ann);
    assert(n_out == 10);
    for (i = 0; i < x->n_row; ++i) {
      const float *y;
      kann_rnn_start(ann);
      for(k = 0; k < 28; ++k) {
        float x1[28];
        memcpy(x1, &x->x[i][k * 28], sizeof(x1));
        y = kann_apply1(ann, x1);
      }
      if (x->rname) printf("%s\t", x->rname[i]);
      for (j = 0; j < n_out; ++j) {
        if (j) putchar('\t');
        printf("%.3g", y[j] + 1.0f - 1.0f);
      }
      putchar('\n');
      kann_rnn_end(ann);
    }
  }

  kann_data_free(x);
  kann_delete(ann);
  return 0;
}

It would be great to see a simple RNN classification example.

Exporting weights?

I am hoping to train a KANN model with a genetic algorithm, but in order to do this I will need to be able to get an array of network weights, and I did not see a way of doing this in the documentation. I could be missing something obvious though.
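A hedged sketch of one way this could work: in my copy of kann.h, kann_t keeps all trainable variables collated in the flat array ann->x, whose length is kann_size_var(ann), so a GA genome could be copied in and out of it directly (verify those fields against your version of the headers):

#include <string.h>
#include "kann.h"

/* Sketch: copy the collated variable array out of / into the network. */
int get_weights(kann_t *ann, float *buf, int buf_len)
{
	int n = kann_size_var(ann);
	if (n > buf_len) return -1;
	memcpy(buf, ann->x, n * sizeof(float));     /* network -> genome */
	return n;
}

void set_weights(kann_t *ann, const float *buf)
{
	memcpy(ann->x, buf, kann_size_var(ann) * sizeof(float));  /* genome -> network */
}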

Batch processing of large dataset

Is it currently possible to process a large dataset in batches rather than loading it all into memory? Possibly it's simply a matter of calling kann_train_fnn1 with each batch?
Any clues most welcome.

ChrisP.
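One simple shape this could take, sketched under the assumption that a hypothetical load_chunk() streams up to max_chunk samples into pre-allocated x/y row arrays. Note that each kann_train_fnn1() call restarts its own optimizer state, so a lower-level loop around kann_cost()/kann_RMSprop() may train more smoothly; this is only the simplest variant.

#include "kann.h"

/* load_chunk() is hypothetical: it fills up to max_chunk rows of x/y and
 * returns how many samples it produced (0 at end of data). */
int load_chunk(float **x, float **y, int max_chunk);

void train_in_chunks(kann_t *ann, float **x, float **y, int max_chunk)
{
	int chunk_n;
	while ((chunk_n = load_chunk(x, y, max_chunk)) > 0)
		/* lr, mini-batch size, 1 epoch per chunk, drop streak, validation fraction */
		kann_train_fnn1(ann, 0.001f, 64, 1, 10, 0.1f, chunk_n, x, y);
}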

Can KANN train an embedding layer?

Or should I just use word2vec instead as a pre-processing step?
Again sorry for stupid questions, happy to read more docs if you can point me at them :-)
Thanks.

Questions about kann_apply1

Hi, I am new to C++. My code has a bug when I use the library.
This is my input and label:

float xx[1][100][400];
float yy[1][100][3];
float **x = (float **)xx;
float **y = (float **)yy;

and then train the net:

kann_train_fnn1(ann, lr, batch_size, epoch, max_drop_streak, frac_val, 1, x, y);

When I tried to test the net, something went wrong:

auto y1 = kann_apply1(ann, x->x[0]); // It causes an error here.

BTW, I didn't save the model between executing kann_train_fnn1() and kann_apply1().
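A likely contributor to the crash, independent of kann: casting a contiguous 3-D array to float ** does not produce an array of row pointers, while kann_train_fnn1()/kann_apply1() expect x[i] to point to the i-th sample. A small sketch of building such row pointers:

#include <stdlib.h>

/* Sketch: build an array of per-sample row pointers into a flat buffer. */
float **make_rows(float *flat, int n_samples, int row_len)
{
	int i;
	float **rows = (float**)malloc(n_samples * sizeof(float*));
	for (i = 0; i < n_samples; ++i)
		rows[i] = flat + (size_t)i * row_len;   /* each row points into the flat buffer */
	return rows;
}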

xor example

Hi, here is my code:

// gcc xor.c ../kann.c ../kautodiff.c -I. -I../ -lm && ./a.out

#include "kann.h"

static kann_t *model_gen(int n_in, int n_out, int loss_type, int n_h_layers, int n_h_neurons)
{
  int i;
  kad_node_t *t;
  t = kann_layer_input(n_in);
  for (i = 0; i < n_h_layers; ++i)
    t = kad_relu(kann_layer_dense(t, n_h_neurons));
  return kann_new(kann_layer_cost(t, n_out, loss_type), 0);
}

static void train(kann_t *ann)
{
  enum { n = 4 };

  float *x[n] = {
    (float[]){ 0, 0, },
    (float[]){ 0, 1, },
    (float[]){ 1, 0, },
    (float[]){ 1, 1, },
  };

  float *y[n] = {
    (float[]){ 0, },
    (float[]){ 1, },
    (float[]){ 1, },
    (float[]){ 0, },
  };

  kann_train_fnn1(ann, 0.001f, 64, 10000, 10, 0.1f, n, x, y);
}

void predict(kann_t *ann)
{
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 0, 0 }), 0.0f);
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 0, 1 }), 1.0f);
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 1, 0 }), 1.0f);
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 1, 1 }), 0.0f);
}

int main(int argc, char *argv[])
{
  kann_t *ann = model_gen(2, 1, KANN_C_CEB, 1, 5);
  train(ann);
  predict(ann);
  kann_delete(ann);

  return 0;
}

Program output:

0.000902 | 0.000000
0.999955 | 1.000000
0.999937 | 1.000000
0.000029 | 0.000000

As far as I know, XOR requires 3 neurons in the hidden layer, not 5. Here is a Keras example:

model = Sequential()
model.add(Dense(3, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['binary_accuracy'])
model.fit(training_data, target_data, epochs=10000, verbose=2)
print model.predict(training_data)
[[0.0073216]
 [0.9848797]
 [0.9848797]
 [0.0067511]]

Why 5 neurons?

Why does accuracy stop improving after a specific epoch?

In the MNIST CNN example, the validation cost stops improving after approximately epoch 11. So running more epochs is useless, since the validation cost only fluctuates around the minimum reached at epoch 11. Could you explain why this happens and how to solve it? (I also tested a variety of CNN structures, but there was no big difference.)

A time series data example for LSTM

I have sequences of inputs (dimension 2) and outputs (dimension 1) like below; all numbers are normalized (-1 to 1).

Below are 2 samples copied from the training data:

(-0.70,-0.23) (-0.70,-0.23) (-0.70,-0.23) (-0.70,-0.23) (-0.70,-0.23) 0.03
(-0.61,-0.26) (-0.61,-0.26) (-0.61,-0.26) (-0.61,-0.26) (-0.61,-0.26) -0.20

Here the last column is an output vector of size 1, and before that we have 5 unrolled pairs of data points. Can you please point me to how to write a training routine for the above example? I didn't understand much from your rnn-bit example, which is quite a different use case, and textgen is difficult to understand.

I tried the code below, but I don't think the KANN_F_TRUTH array is correctly populated:

for (int j = 0; j < num_rows - batch_size_; j += batch_size_) {
	int k;
	for (k = 0; k < ulen; ++k) {
		for (int b = 0; b < batch_size_; ++b) {
			int s = j + b; // shuf[j + b];
			for (int i = 0; i < input_; ++i)
				x[k][b*input_ + i] = data.x[s][k][i];
			for (int i = 0; i < output_; ++i)
				y[k][b*output_ + i] = data.y[s][i]; // <--------------------------- some fix required here
		}
	}

	cost += kann_cost(ua, 0, 1) * ulen * batch_size_;
	n_cerr += kann_class_error(ua, &k);
	tot_base += k;
	//kad_check_grad(ua->n, ua->v, ua->n-1);
	kann_RMSprop(n_var, error, NULL, 0.9f, ua->g, ua->x, r);
	tot += ulen * batch_size_;
}
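For what it's worth, a sketch of a model with a single output per sequence, mirroring the kad_select() trick from the CRNN issue above (the hidden size and the choice of kann_layer_lstm() here are arbitrary assumptions, not the library's prescribed recipe):

#include "kann.h"

/* Sketch: 2 input features per step, one scalar regression output per sequence. */
kann_t *seq_to_one(int n_in /* = 2 */, int n_hidden)
{
	kad_node_t *t = kann_layer_input(n_in);
	t = kann_layer_lstm(t, n_hidden, KANN_RNN_VAR_H0);
	t = kad_select(1, &t, -1);                  /* keep only the last unrolled step */
	return kann_new(kann_layer_cost(t, 1, KANN_C_MSE), 0);
}

The training loop can then look like the CRNN train() above, where y holds a single time step (y = malloc(1 * sizeof(float*))) rather than one per unrolled step.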

Getting Started example not working on macOS 11.3, Apple M1 chip

Hello, people.

I just wanted to let you know that at least the Getting Started example doesn't seem to be working on macOS 11.3 with the Apple M1 chip, whereas it works correctly on my old machine, an Intel MacBook Pro running macOS 10.15.7.

If I print the size of int, float, and double types, I get the same results on both machines (4, 4, and 8).

This is the output when I try running the Getting Started example on Big Sur:

dariosanfilippo@Darios-MBP kann % seq 30000 | awk -v m=10000 '{a=int(m*rand());b=int(m*rand());print a,b,a+b}' \
  | ./examples/rnn-bit -m7 -o add.kan -
epoch: 1; cost: 0.0587254 (class error: 2.59%)
epoch: 2; cost: 0.000135723 (class error: 0.00%)
epoch: 3; cost: 7.7552e-05 (class error: 0.00%)
epoch: 4; cost: 4.28615e-05 (class error: 0.00%)
epoch: 5; cost: 4.24452e-05 (class error: 0.00%)
epoch: 6; cost: 2.26656e-05 (class error: 0.00%)
epoch: 7; cost: 1.84629e-05 (class error: 0.00%)
dariosanfilippo@Darios-MBP kann % echo 400958 737471 | ./examples/rnn-bit -Ai add.kan -
1924146487037

Would you know what the issue might be?

Thank you so much for your help.

Dario
