kann's Issues

Adding CUDA support?

Hi,

Is there any plan to add CUDA support in the near future? It would be very useful for training medium-sized networks, and very attractive for platforms like the Tegra TK1. Libraries like Caffe and MXNet depend on many other libraries, and resolving those dependency conflicts during installation can take a lot of time.

Custom loss function example using kad_op functions

Hi,

I'm trying to implement a custom loss function with a simple MLP.
Is there an example of using the kad_op functions to accomplish this so that I benefit from automatic differentiation?
I don't want to explicitly write the backward computation, as is done for the currently implemented loss functions (MSE, CE, etc.).

Or is this approach not feasible (for memory-consumption reasons), since it would require computing and storing the gradients for each operation in the loss function?

I'd greatly appreciate any help/feedback/example!

Thanks!
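A minimal sketch of what this could look like, built only from primitive kad_* operators so the gradients come from automatic differentiation. It mirrors how kann_layer_cost() wires up the truth and cost nodes in kann.c; the kad_reduce_mean() axis convention is an assumption here and should be checked against kautodiff.h.

#include "kann.h"

/* Sketch: hand-rolled MSE from primitive ops instead of kad_mse(). */
kann_t *mlp_with_custom_cost(int n_in, int n_out)
{
	kad_node_t *t, *truth, *cost;

	t = kann_layer_input(n_in);
	t = kad_relu(kann_layer_dense(t, 64));
	t = kann_layer_dense(t, n_out);
	t->ext_flag |= KANN_F_OUT;               /* mark the prediction node */

	truth = kad_feed(2, 1, n_out);           /* labels fed at training time */
	truth->ext_flag |= KANN_F_TRUTH;

	/* loss = mean((pred - truth)^2); the two reductions collapse the
	 * (mini-batch, n_out) tensor to a scalar -- verify the axis semantics
	 * against your copy of kautodiff.h */
	cost = kad_square(kad_sub(t, truth));
	cost = kad_reduce_mean(kad_reduce_mean(cost, 1), 0);
	cost->ext_flag |= KANN_F_COST;

	return kann_new(cost, 0);
}

Training would then go through kann_train_fnn1() as usual, since that routine only needs the KANN_F_TRUTH and KANN_F_COST nodes to be present in the graph.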

What does "va_start" mean in kann ?

In kann.c, line 521:
va_start(ap, n_d); for (i = 0; i < n_d; ++i) d[i] = va_arg(ap, int); va_end(ap)
My question is: what is this used for?
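va_start, va_arg and va_end are the standard C variadic-argument macros from <stdarg.h>: kad_feed() and similar constructors accept a variable number of dimension sizes, and that line copies them into the local array d[]. A standalone illustration of the same pattern:

#include <stdarg.h>
#include <stdio.h>

/* Illustration only: collect a variable number of dimension sizes,
 * the same way kautodiff.c does. */
static void print_dims(int n_d, ...)
{
	int i, d[8];
	va_list ap;
	va_start(ap, n_d);              /* start reading arguments after n_d */
	for (i = 0; i < n_d; ++i)
		d[i] = va_arg(ap, int);     /* pull out each dimension size */
	va_end(ap);
	for (i = 0; i < n_d; ++i)
		printf("dim %d = %d\n", i, d[i]);
}

int main(void)
{
	print_dims(3, 1, 1, 28);        /* like kad_feed(3, 1, 1, 28) */
	return 0;
}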

kann_layer_linear

I have not been able to find the kann_layer_linear definition in the source code.

It is from the example (A complete example) given on the page.

t = kad_relu(kann_layer_linear(t, 64));

It's not that I need it right now, but it is the first thing I tried running.

Model inference on ARM M4F

Hi there.
Great work on the project! I have made successful progress on macOS. But I was wondering whether the trained model, say mnist-cnn.kan, could be transferred to an ARM M4F chip for inference? There would definitely be input and output data processing, but apart from that, is it possible to use the trained model on ARM? Thanks in advance!
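For reference, inference with a saved model only needs kann_load() and kann_apply1(), which are plain C99 plus libm; a minimal sketch (on a bare-metal M4F the file-based loading would likely have to be replaced with weights baked into flash, and the 28x28/10-class sizes below are just this example's assumptions):

#include "kann.h"

/* Sketch: single-sample inference with a saved model. */
int predict_digit(const char *model_fn, float *pixels /* 28*28 */, float *out10)
{
	int j, n_out;
	const float *y;
	kann_t *ann = kann_load(model_fn);
	if (ann == 0) return -1;
	n_out = kann_dim_out(ann);
	y = kann_apply1(ann, pixels);            /* forward pass only */
	for (j = 0; j < n_out && j < 10; ++j) out10[j] = y[j];
	kann_delete(ann);
	return n_out;
}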

Image classification of cats, dogs, and pandas

I am attempting to use KANN to classify cats, dogs, and pandas. I have pre-processed the data so that every image is read in as RGB bytes, scaled to float, and resized to 32x32 (still 3 channels). I store the images and labels as float **x, float **y, where x has dimensions [nsamples][32x32x3] (a flattened array with rows = ncols * 3, laid out as RGB per pixel) and y has dimensions [nsamples][3] (for the 3 classes of cat/dog/panda). I split my data into a 75% training set and send it into a modified version of the "Complete Example" provided:

int train_kann(float **x, int nrows, int ncols, int nbands, float **y, int nclasses, int n_samples)
{
	int max_bit, i;
	kad_node_t *t;
	kann_t *ann;

	max_bit = nrows * ncols * nbands;

	// construct an MLP with one hidden layer
	t = kann_layer_input(max_bit);
	t = kad_relu(kann_layer_dense(t, 64));
	t = kann_layer_cost(t, nclasses, KANN_C_CEM); // output uses 1-hot encoding
	ann = kann_new(t, 0);

	// train
	kann_train_fnn1(ann, 0.001f, 64, 50, 10, 0.1f, n_samples, x, y);

	return 0;
}

However, I am getting some strange output from kann_train_fnn1: it is not reporting the class error for training or validation, so I am getting n_train_base == 0 and n_val == 0 (meaning no class error is computed?).

epoch: 1; training cost: 13.2655; validation cost: 13.8155
epoch: 2; training cost: 13.8112; validation cost: 13.8155
epoch: 3; training cost: 13.8155; validation cost: 13.8155
epoch: 4; training cost: 13.8155; validation cost: 13.8155
(repeats these values for the remaining epochs)

I have a feeling this is an issue of how I set up my data and labels. Any help would be greatly appreciated.
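One thing worth double-checking in a setup like this is that each y row is a true one-hot vector for KANN_C_CEM and that the pixel values are scaled into [0, 1]. A hypothetical helper for preparing one sample (raw, label and the size arguments are placeholders for this data pipeline, not part of kann):

/* Hypothetical helper: builds one sample's feature row and one-hot label row. */
static void fill_sample(const unsigned char *raw, int n_values, int label, int nclasses,
                        float *x_row, float *y_row)
{
	int k;
	for (k = 0; k < n_values; ++k)
		x_row[k] = raw[k] / 255.0f;      /* scale RGB bytes to [0,1] */
	for (k = 0; k < nclasses; ++k)
		y_row[k] = 0.0f;
	y_row[label] = 1.0f;                 /* one-hot target for KANN_C_CEM */
}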

Can we define the neuron number for each layer?

Hi,

From the MLP example code, I can see that I can set the number of hidden layers, but the number of neurons is the same for every layer. Is there any way to define a different number of neurons for different layers?

Cheers,
Travis
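For reference, the layer width is simply the second argument of each kann_layer_dense() call, so different sizes per layer come from chaining the calls explicitly instead of looping with a fixed n_h_neurons; a small sketch:

#include "kann.h"

/* Sketch: an MLP with a different neuron count in each hidden layer. */
kann_t *mlp_tapered(int n_in, int n_out)
{
	kad_node_t *t = kann_layer_input(n_in);
	t = kad_relu(kann_layer_dense(t, 128));  /* hidden layer 1: 128 neurons */
	t = kad_relu(kann_layer_dense(t, 64));   /* hidden layer 2: 64 neurons */
	t = kad_relu(kann_layer_dense(t, 32));   /* hidden layer 3: 32 neurons */
	return kann_new(kann_layer_cost(t, n_out, KANN_C_CEM), 0);
}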

Convolutional recurrent neural network

I want to combine a convolutional layer with a recurrent one. This code is based on #19:

    kad_node_t *t;
    int rnn_flag = KANN_RNN_VAR_H0;
    if (norm) rnn_flag |= KANN_RNN_NORM;
    t = kad_feed(3, 1, 1, 28), t->ext_flag |= KANN_F_IN;
    t = kad_relu(kann_layer_conv1d(t, 32, 3, 1, 0)); // 3 kernel; 1 stride; 0 padding
    t = kann_layer_dropout(t, dropout);
    t = kad_max1d(t, 2, 2, 0); // 2 kernel; 2 stride; 0 padding
    for (i = 0; i < n_h_layers; ++i) {
      t = kann_layer_gru(t, n_h_neurons, rnn_flag);
      t = kann_layer_dropout(t, dropout);
    }
    t = kad_select(1, &t, -1);
    ann = kann_new(kann_layer_cost(t, 10, KANN_C_CEB), 0);
    kad_print_graph(stdout, ann->n, ann->v);

It works:

./mnist-crnn -i mnist-crnn.kan kann-data/mnist-test-x.knd | kann-data/mnist-eval.pl
Error rate: 1.19%

Questions:

  • I stumbled across the same problem as #6 at first, then I replaced kann_layer_input with kad_feed(3, 1, 1, 28) to make it work, but the numbers 1, 1 still look like magic to me... Are they correct?

  • Does backprop work correctly for conv1d on an unrolled RNN?

Whole code:

#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "kann_extra/kann_data.h"
#include "kann.h"

typedef struct {
  int n_in, n_out, ulen, n;
  float **x, **y;
} train_data;

static void train(kann_t *ann, train_data *d, float lr, int mini_size, int max_epoch, const char *fn, int n_threads)
{
  float **x, **y, *r, best_cost = 1e30f;
  int epoch, j, n_var, *shuf;
  kann_t *ua;

  n_var = kann_size_var(ann);
  r = (float*)calloc(n_var, sizeof(float));
  x = (float**)malloc(d->ulen * sizeof(float*));
  y = (float**)malloc(1 * sizeof(float*));
  for (j = 0; j < d->ulen; ++j) {
    x[j] = (float*)calloc(mini_size * d->n_in, sizeof(float));
  }
  y[0] = (float*)calloc(mini_size * d->n_out, sizeof(float));
  shuf = (int*)calloc(d->n, sizeof(int));

  ua = kann_unroll(ann, d->ulen);
  kann_set_batch_size(ua, mini_size);
  kann_mt(ua, n_threads, mini_size);
  kann_feed_bind(ua, KANN_F_IN,    0, x);
  kann_feed_bind(ua, KANN_F_TRUTH, 0, y);
  kann_switch(ua, 1);
  for (epoch = 0; epoch < max_epoch; ++epoch) {
    kann_shuffle(d->n, shuf);
    double cost = 0.0;
    int tot = 0, tot_base = 0, n_cerr = 0;
    for (j = 0; j < d->n - mini_size; j += mini_size) {
      int b, k;
      for (b = 0; b < mini_size; ++b) {
        int s = shuf[j + b];
        for (k = 0; k < d->ulen; ++k) {
          memcpy(&x[k][b * d->n_in], &d->x[s][k * d->n_in], d->n_in * sizeof(float));
        }
        memcpy(&y[0][b * d->n_out], d->y[s], d->n_out * sizeof(float));
      }
      cost += kann_cost(ua, 0, 1) * d->ulen * mini_size;
      n_cerr += kann_class_error(ua, &k);
      tot_base += k;
      //kad_check_grad(ua->n, ua->v, ua->n-1);
      kann_RMSprop(n_var, lr, 0, 0.9f, ua->g, ua->x, r);
      tot += d->ulen * mini_size;
    }
    if (cost < best_cost) {
      best_cost = cost;
      if (fn) kann_save(fn, ann);
    }
    fprintf(stderr, "epoch: %d; cost: %g (class error: %.2f%%)\n", epoch+1, cost / tot, 100.0f * n_cerr / tot_base);
  }

  kann_delete_unrolled(ua);

  for (j = 0; j < d->ulen; ++j) {
    free(x[j]);
  }
  free(y[0]); free(y); free(x); free(r); free(shuf);
}

static train_data* create_train_data(kann_t *ann, kann_data_t *x, kann_data_t *y)
{
  train_data *d;
  d = (train_data*)malloc(sizeof(*d));
  assert(d);
  assert(x->n_row == y->n_row);
  d->x = x->x;
  d->y = y->x;
  d->ulen = 28; // 28x28
  d->n = x->n_row;
  d->n_in = kann_dim_in(ann);
  d->n_out = kann_dim_out(ann);
  return d;
}

int main(int argc, char *argv[])
{
  kann_t *ann;
  kann_data_t *x, *y;
  char *fn_in = 0, *fn_out = 0;
  int c, i, mini_size = 64, max_epoch = 50, seed = 84, n_h_layers = 1, n_h_neurons = 64, norm = 1, n_h_flt = 32, n_threads = 1;
  float lr = 0.001f, dropout = 0.2f;

  while ((c = getopt(argc, argv, "i:o:m:l:n:d:s:t:N")) >= 0) {
    if (c == 'i') fn_in = optarg;
    else if (c == 'o') fn_out = optarg;
    else if (c == 'm') max_epoch = atoi(optarg);
    else if (c == 'l') n_h_layers = atoi(optarg);
    else if (c == 'n') n_h_neurons = atoi(optarg);
    else if (c == 'd') dropout = atof(optarg);
    else if (c == 's') seed = atoi(optarg);
    else if (c == 't') n_threads = atoi(optarg);
    else if (c == 'N') norm = 0;
  }

  if (argc - optind == 0 || (argc - optind == 1 && fn_in == 0)) {
    FILE *fp = stdout;
    fprintf(fp, "Usage: mnist-cnn [-i model] [-o model] [-t nThreads] <x.knd> [y.knd]\n");
    return 1;
  }

  kad_trap_fe();
  kann_srand(seed);
  if (fn_in) {
    ann = kann_load(fn_in);
  } else {
    kad_node_t *t;
    int rnn_flag = KANN_RNN_VAR_H0;
    if (norm) rnn_flag |= KANN_RNN_NORM;
    t = kad_feed(3, 1, 1, 28), t->ext_flag |= KANN_F_IN;
    t = kad_relu(kann_layer_conv1d(t, 32, 3, 1, 0)); // 3 kernel; 1 stride; 0 padding
    t = kann_layer_dropout(t, dropout);
    t = kad_max1d(t, 2, 2, 0); // 2 kernel; 2 stride; 0 padding
    for (i = 0; i < n_h_layers; ++i) {
      t = kann_layer_gru(t, n_h_neurons, rnn_flag);
      t = kann_layer_dropout(t, dropout);
    }
    t = kad_select(1, &t, -1);
    ann = kann_new(kann_layer_cost(t, 10, KANN_C_CEB), 0);
    kad_print_graph(stdout, ann->n, ann->v);
  }

  x = kann_data_read(argv[optind]);
  assert(x->n_col == 28 * 28);
  y = argc - optind >= 2? kann_data_read(argv[optind+1]) : 0;

  if (y) { // training
    assert(y->n_col == 10);
    if (n_threads > 1) kann_mt(ann, n_threads, mini_size);
    train_data *d;
    d = create_train_data(ann, x, y);
    train(ann, d, lr, mini_size, max_epoch, fn_out, n_threads);
    free(d);
    kann_data_free(y);
  } else { // applying
    int i, j, k, n_out;
    kann_switch(ann, 0);
    n_out = kann_dim_out(ann);
    assert(n_out == 10);
    for (i = 0; i < x->n_row; ++i) {
      const float *y;
      kann_rnn_start(ann);
      for(k = 0; k < 28; ++k) {
        float x1[28];
        memcpy(x1, &x->x[i][k * 28], sizeof(x1));
        y = kann_apply1(ann, x1);
      }
      if (x->rname) printf("%s\t", x->rname[i]);
      for (j = 0; j < n_out; ++j) {
        if (j) putchar('\t');
        printf("%.3g", y[j] + 1.0f - 1.0f);
      }
      putchar('\n');
      kann_rnn_end(ann);
    }
  }

  kann_data_free(x);
  kann_delete(ann);
  return 0;
}

License Type

Hello,

I am looking for a lightweight and standalone framework for deep learning, and this one looks like it could match my needs.
What license covers the source code?
MIT? BSD-3?

Thanks,
Mathieu

Format of model file

I would like to use a model that is pre-trained in Keras or TensorFlow and run it with kann.
I am trying to find the file format that the weights need to be saved in so that kann can load them.
Please advise.

resnet

I want to apply it to ResNet, so I wrote:

kad_node_t *basic_block(kad_node_t *x, int channel)
{
	kad_node_t *y = kad_relu(kann_layer_conv2d(x, channel, 3, 3, 1, 1, 1, 1));
	y = kann_layer_conv2d(y, channel, 3, 3, 1, 1, 1, 1);
	return kad_relu(kad_add(x, y));
}

but it raises a segmentation fault.

Transform script for dataset

Hi,
I was able to download the dataset you provided. Could I get your transformation script so that we can properly convert other datasets for loading into kann? (There is no such script in the kann repository.)

Thank you

Possible Conv1D and Max1D Issue

Hi there.
I am dealing with a 1D signal, so I have modified the mnist-cnn.c example and changed the model as shown below:

kad_node_t *t;
t = kann_layer_input(200);
t = kad_relu(kann_layer_conv1d(t, n_h_flt, 5, 1, 2));
t = kad_max1d(t, 2, 1, 1);
t = kad_relu(kann_layer_conv1d(t, n_h_flt, 5, 1, 2));
t = kad_max1d(t, 2, 1, 1);
t = kann_layer_dense(t, 200);
t = kad_relu(t);
t = kann_layer_dense(t, 100);
t = kad_relu(t);
t = kann_layer_dense(t, 50);
t = kad_relu(t);
ann = kann_new(kann_layer_cost(t, 2, KANN_C_CEB), 0);

I added the padding to keep the length. My input is 200 samples long and the output is a simple true/false.
However, when I compile and run training on this, the terminal immediately shows "Segmentation fault: 11". I believe the model to be correct, so I suspect the issue is in one of the 1D-related functions?

Thanks in advance!

Why isn't KANN scalable?

Hi,
I was wondering about the statement in the README - "KANN is not as scalable, but it is close in flexibility, has a much smaller code base and only depends on the standard C library.".

Why isn't KANN scalable, and why isn't it suitable for training deeper networks?

More examples (classify text)

Hi, basically I'm hoping someone can point me to (or create) an example of a convolution-based text-sentiment classifier using KANN. It doesn't need to be exact, just something I can use as a base to start from. Something like this example from Keras; even just the core model code would give me a starting point...

embedding_dim = 100
model = Sequential()
model.add(layers.Embedding(vocab_size, embedding_dim, input_length=maxlen))
model.add(layers.Conv1D(128, 5, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(10, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.summary()

(example from) https://realpython.com/python-keras-text-classification/

Here is my guess... but it's pure guess work so it may be nonsense :-)

kann_t *model_gen_classify(int n_h_flt, int n_h_fc)
{
	int wordsize = 10;           // let's assume word embeddings
	int sentence = 100;          // length of each sentence to analyze
	int wordgroup[] = {3, 4, 5}; // group words in 3-, 4-, 5-word groups in conv layers
	float dropout = 0.2f;
	kann_t *ann;
	kad_node_t *t;
	t = kad_feed(4, 1, 1, sentence, wordsize), t->ext_flag |= KANN_F_IN;
	t = kad_relu(kann_layer_conv1d(t, n_h_flt, wordgroup[0], wordsize, 1, 1, 0, 0));
	t = kad_relu(kann_layer_conv1d(t, n_h_flt, wordgroup[1], wordsize, 1, 1, 0, 0));
	t = kad_relu(kann_layer_conv1d(t, n_h_flt, wordgroup[2], wordsize, 1, 1, 0, 0));
	t = kann_layer_dropout(t, dropout);
	t = kann_layer_dense(t, n_h_fc);
	t = kad_relu(t);
	ann = kann_new(kann_layer_cost(t, 1, KANN_C_CEB), 0);
	return ann;
}

mnist-cnn example fails assert on training

I'm not sure if I'm missing something, but I tried the mnist-cnn example in the same way as the README, and training fails an assert on line 51:

assert(x->n_col == 28 * 28);

I printed x->n_col and the result is 0. I'm not sure if the problem is in the code or in the data (I got the data from this repo as well, as stated in the examples' README file).

I tried removing the assert but it naturally just segfaults elsewhere.

The mlp example works fine, so I assume it isn't the data.

RNN classification example

When classifying a sequence, we would like the network to have one output instead of a sequence of outputs. According to 01user.md, kad_avg can be used to classify a sequence. I tried this on MNIST. It works, but
I am not sure how to train such a network: during training we don't even know the output values other than the last one. In the line memcpy(&y[k][b * d->n_out], d->y[s], d->n_out * sizeof(float)); each y in the output sequence gets the same value d->y[s], which looks strange.

#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "kann_extra/kann_data.h"
#include "kann.h"

typedef struct {
  int n_in, n_out, ulen, n;
  float **x, **y;
} train_data;

static void train(kann_t *ann, train_data *d, float lr, int mini_size, int max_epoch, const char *fn, int n_threads)
{
  float **x, **y, *r, best_cost = 1e30f;
  int epoch, j, n_var, *shuf;
  kann_t *ua;

  n_var = kann_size_var(ann);
  r = (float*)calloc(n_var, sizeof(float));
  x = (float**)malloc(d->ulen * sizeof(float*));
  y = (float**)malloc(d->ulen * sizeof(float*));
  for (j = 0; j < d->ulen; ++j) {
    x[j] = (float*)calloc(mini_size * d->n_in, sizeof(float));
    y[j] = (float*)calloc(mini_size * d->n_out, sizeof(float));
  }
  shuf = (int*)calloc(d->n, sizeof(int));

  ua = kann_unroll(ann, d->ulen);
  kann_set_batch_size(ua, mini_size);
  kann_mt(ua, n_threads, mini_size);
  kann_feed_bind(ua, KANN_F_IN,    0, x);
  kann_feed_bind(ua, KANN_F_TRUTH, 0, y);
  kann_switch(ua, 1);
  for (epoch = 0; epoch < max_epoch; ++epoch) {
    kann_shuffle(d->n, shuf);
    double cost = 0.0;
    int tot = 0, tot_base = 0, n_cerr = 0;
    for (j = 0; j < d->n - mini_size; j += mini_size) {
      int b, k;
      for (k = 0; k < d->ulen; ++k) {
        for (b = 0; b < mini_size; ++b) {
          int s = shuf[j + b];
          memcpy(&x[k][b * d->n_in], &d->x[s][k * d->n_in], d->n_in * sizeof(float));
          memcpy(&y[k][b * d->n_out], d->y[s], d->n_out * sizeof(float));
        }
      }
      cost += kann_cost(ua, 0, 1) * d->ulen * mini_size;
      n_cerr += kann_class_error(ua, &k);
      tot_base += k;
      //kad_check_grad(ua->n, ua->v, ua->n-1);
      kann_RMSprop(n_var, lr, 0, 0.9f, ua->g, ua->x, r);
      tot += d->ulen * mini_size;
    }
    if (cost < best_cost) {
      best_cost = cost;
      if (fn) kann_save(fn, ann);
    }
    fprintf(stderr, "epoch: %d; cost: %g (class error: %.2f%%)\n", epoch+1, cost / tot, 100.0f * n_cerr / tot_base);
  }

  kann_delete_unrolled(ua);

  for (j = 0; j < d->ulen; ++j) {
    free(y[j]); free(x[j]);
  }
  free(y); free(x); free(r); free(shuf);
}

static train_data* create_train_data(kann_t *ann, kann_data_t *x, kann_data_t *y)
{
  train_data *d;
  d = (train_data*)malloc(sizeof(*d));
  assert(d);
  assert(x->n_row == y->n_row);
  d->x = x->x;
  d->y = y->x;
  d->ulen = 28; // 28x28
  d->n = x->n_row;
  d->n_in = kann_dim_in(ann);
  d->n_out = kann_dim_out(ann);
  return d;
}

int main(int argc, char *argv[])
{
  kann_t *ann;
  kann_data_t *x, *y;
  char *fn_in = 0, *fn_out = 0;
  int c, i, mini_size = 64, max_epoch = 50, seed = 84, n_h_layers = 1, n_h_neurons = 64, norm = 1, n_threads = 1;
  float lr = 0.001f, dropout = 0.2f;

  while ((c = getopt(argc, argv, "i:o:m:l:n:d:s:t:N")) >= 0) {
    if (c == 'i') fn_in = optarg;
    else if (c == 'o') fn_out = optarg;
    else if (c == 'm') max_epoch = atoi(optarg);
    else if (c == 'l') n_h_layers = atoi(optarg);
    else if (c == 'n') n_h_neurons = atoi(optarg);
    else if (c == 'd') dropout = atof(optarg);
    else if (c == 's') seed = atoi(optarg);
    else if (c == 't') n_threads = atoi(optarg);
    else if (c == 'N') norm = 0;
  }

  if (argc - optind == 0 || (argc - optind == 1 && fn_in == 0)) {
    FILE *fp = stdout;
    fprintf(fp, "Usage: mnist-cnn [-i model] [-o model] [-t nThreads] <x.knd> [y.knd]\n");
    return 1;
  }

  kad_trap_fe();
  kann_srand(seed);
  if (fn_in) {
    ann = kann_load(fn_in);
  } else {
    kad_node_t *t;
    int rnn_flag = KANN_RNN_VAR_H0;
    if (norm) rnn_flag |= KANN_RNN_NORM;
    t = kann_layer_input(28); // 28x28
    for (i = 0; i < n_h_layers; ++i) {
      t = kann_layer_gru(t, n_h_neurons, rnn_flag);
      t = kann_layer_dropout(t, dropout);
    }
    t = kad_avg(1, &t);
    ann = kann_new(kann_layer_cost(t, 10, KANN_C_CEB), 0);
  }

  x = kann_data_read(argv[optind]);
  assert(x->n_col == 28 * 28);
  y = argc - optind >= 2? kann_data_read(argv[optind+1]) : 0;

  if (y) { // training
    assert(y->n_col == 10);
    if (n_threads > 1) kann_mt(ann, n_threads, mini_size);
    train_data *d;
    d = create_train_data(ann, x, y);
    train(ann, d, lr, mini_size, max_epoch, fn_out, n_threads);
    free(d);
    kann_data_free(y);
  } else { // applying
    int i, j, k, n_out;
    kann_switch(ann, 0);
    n_out = kann_dim_out(ann);
    assert(n_out == 10);
    for (i = 0; i < x->n_row; ++i) {
      const float *y;
      kann_rnn_start(ann);
      for(k = 0; k < 28; ++k) {
        float x1[28];
        memcpy(x1, &x->x[i][k * 28], sizeof(x1));
        y = kann_apply1(ann, x1);
      }
      if (x->rname) printf("%s\t", x->rname[i]);
      for (j = 0; j < n_out; ++j) {
        if (j) putchar('\t');
        printf("%.3g", y[j] + 1.0f - 1.0f);
      }
      putchar('\n');
      kann_rnn_end(ann);
    }
  }

  kann_data_free(x);
  kann_delete(ann);
  return 0;
}

It would be great to see a simple RNN classification example.

Exporting weights?

I am hoping to train a KANN model with a genetic algorithm, but in order to do this I will need to be able to get an array of network weights, and I did not see a way of doing this in the documentation. I could be missing something obvious though.
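A hedged sketch of one way this could work: in my copy of kann.h, kann_t keeps all trainable variables collated in the flat array ann->x, whose length is kann_size_var(ann), so a GA genome could be copied in and out of it directly (verify those fields against your version of the headers):

#include <string.h>
#include "kann.h"

/* Sketch: copy the collated variable array out of / into the network. */
int get_weights(kann_t *ann, float *buf, int buf_len)
{
	int n = kann_size_var(ann);
	if (n > buf_len) return -1;
	memcpy(buf, ann->x, n * sizeof(float));     /* network -> genome */
	return n;
}

void set_weights(kann_t *ann, const float *buf)
{
	memcpy(ann->x, buf, kann_size_var(ann) * sizeof(float));  /* genome -> network */
}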

Batch processing of large dataset

Is it currently possible to process a large dataset in batches rather than loading it all into memory? Possibly it's simply a matter of calling kann_train_fnn1 with each batch?
Any clues most welcome.

ChrisP.
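One simple shape this could take, sketched under the assumption that a hypothetical load_chunk() streams up to max_chunk samples into pre-allocated x/y row arrays. Note that each kann_train_fnn1() call restarts its own optimizer state, so a lower-level loop around kann_cost()/kann_RMSprop() may train more smoothly; this is only the simplest variant.

#include "kann.h"

/* load_chunk() is hypothetical: it fills up to max_chunk rows of x/y and
 * returns how many samples it produced (0 at end of data). */
int load_chunk(float **x, float **y, int max_chunk);

void train_in_chunks(kann_t *ann, float **x, float **y, int max_chunk)
{
	int chunk_n;
	while ((chunk_n = load_chunk(x, y, max_chunk)) > 0)
		/* lr, mini-batch size, 1 epoch per chunk, drop streak, validation fraction */
		kann_train_fnn1(ann, 0.001f, 64, 1, 10, 0.1f, chunk_n, x, y);
}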

Can KANN train an embedding layer?

Or should I just use word2vec instead as a pre-processing step?
Again sorry for stupid questions, happy to read more docs if you can point me at them :-)
Thanks.

Questions about kann_apply1

Hi, I am new to C++. My code has a bug when I use the library.
This is my input and label:

float xx[1][100][400];
float yy[1][100][3];
float **x = (float **)xx;
float **y = (float **)yy;

and then train the net:

kann_train_fnn1(ann, lr, batch_size, epoch, max_drop_streak, frac_val, 1, x, y);

When I tried to test the net, something went wrong:

auto y1 = kann_apply1(ann, x->x[0]); // It causes an error here.

BTW, I didn't save the model between executing kann_train_fnn1() and kann_apply1().
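A likely contributor to the crash, independent of kann: casting a contiguous 3-D array to float ** does not produce an array of row pointers, while kann_train_fnn1()/kann_apply1() expect x[i] to point to the i-th sample. A small sketch of building such row pointers:

#include <stdlib.h>

/* Sketch: build an array of per-sample row pointers into a flat buffer. */
float **make_rows(float *flat, int n_samples, int row_len)
{
	int i;
	float **rows = (float**)malloc(n_samples * sizeof(float*));
	for (i = 0; i < n_samples; ++i)
		rows[i] = flat + (size_t)i * row_len;   /* each row points into the flat buffer */
	return rows;
}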

xor example

Hi, here is my code:

// gcc xor.c ../kann.c ../kautodiff.c -I. -I../ -lm && ./a.out

#include "kann.h"

static kann_t *model_gen(int n_in, int n_out, int loss_type, int n_h_layers, int n_h_neurons)
{
  int i;
  kad_node_t *t;
  t = kann_layer_input(n_in);
  for (i = 0; i < n_h_layers; ++i)
    t = kad_relu(kann_layer_dense(t, n_h_neurons));
  return kann_new(kann_layer_cost(t, n_out, loss_type), 0);
}

static void train(kann_t *ann)
{
  enum { n = 4 };

  float *x[n] = {
    (float[]){ 0, 0, },
    (float[]){ 0, 1, },
    (float[]){ 1, 0, },
    (float[]){ 1, 1, },
  };

  float *y[n] = {
    (float[]){ 0, },
    (float[]){ 1, },
    (float[]){ 1, },
    (float[]){ 0, },
  };

  kann_train_fnn1(ann, 0.001f, 64, 10000, 10, 0.1f, n, x, y);
}

void predict(kann_t *ann)
{
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 0, 0 }), 0.0f);
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 0, 1 }), 1.0f);
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 1, 0 }), 1.0f);
  printf("%f | %f\n", *kann_apply1(ann, (float[]){ 1, 1 }), 0.0f);
}

int main(int argc, char *argv[])
{
  kann_t *ann = model_gen(2, 1, KANN_C_CEB, 1, 5);
  train(ann);
  predict(ann);
  kann_delete(ann);

  return 0;
}

Program output:

0.000902 | 0.000000
0.999955 | 1.000000
0.999937 | 1.000000
0.000029 | 0.000000

As far as I know, XOR requires 3 neurons in the hidden layer, not 5. Here is a Keras example:

model = Sequential()
model.add(Dense(3, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['binary_accuracy'])
model.fit(training_data, target_data, epochs=10000, verbose=2)
print model.predict(training_data)
[[0.0073216]
 [0.9848797]
 [0.9848797]
 [0.0067511]]

Why 5 neurons?

Why does accuracy stop improving after a specific epoch?

In the MNIST CNN example, the validation cost stops improving after approximately epoch 11. So running more epochs is useless, since the validation cost only fluctuates around the minimum reached at epoch 11. Could you explain why this happens and how to solve it? (I also tested a variety of CNN structures, but there was no big difference.)

A time series data example for LSTM

I have sequences of inputs (dimension 2) and outputs (dimension 1) like below; all numbers are normalized (-1 to 1).

Below are 2 samples copied from the training data:

(-0.70,-0.23) (-0.70,-0.23) (-0.70,-0.23) (-0.70,-0.23) (-0.70,-0.23) 0.03
(-0.61,-0.26) (-0.61,-0.26) (-0.61,-0.26) (-0.61,-0.26) (-0.61,-0.26) -0.20

Here the last column is an output vector of size 1, and before that we have 5 unrolled pairs of data points. Can you please point me to how to write a training routine for the above example? I didn't understand much from your rnn-bit example, which is quite a different use case, and textgen is difficult to understand.

I tried the code below, but I don't think the KANN_F_TRUTH array is correctly populated:

for (int j = 0; j < num_rows - batch_size_; j += batch_size_) {
	int k;
	for (k = 0; k < ulen; ++k) {
		for (int b = 0; b < batch_size_; ++b) {
			int s = j + b; // shuf[j + b];
			for (int i = 0; i < input_; ++i)
				x[k][b*input_ + i] = data.x[s][k][i];
			for (int i = 0; i < output_; ++i)
				y[k][b*output_ + i] = data.y[s][i]; // <--------------------------- some fix required here
		}
	}

	cost += kann_cost(ua, 0, 1) * ulen * batch_size_;
	n_cerr += kann_class_error(ua, &k);
	tot_base += k;
	//kad_check_grad(ua->n, ua->v, ua->n-1);
	kann_RMSprop(n_var, error, NULL, 0.9f, ua->g, ua->x, r);
	tot += ulen * batch_size_;
}
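For what it's worth, a sketch of a model with a single output per sequence, mirroring the kad_select() trick from the CRNN issue above (the hidden size and the choice of kann_layer_lstm() here are arbitrary assumptions, not the library's prescribed recipe):

#include "kann.h"

/* Sketch: 2 input features per step, one scalar regression output per sequence. */
kann_t *seq_to_one(int n_in /* = 2 */, int n_hidden)
{
	kad_node_t *t = kann_layer_input(n_in);
	t = kann_layer_lstm(t, n_hidden, KANN_RNN_VAR_H0);
	t = kad_select(1, &t, -1);                  /* keep only the last unrolled step */
	return kann_new(kann_layer_cost(t, 1, KANN_C_MSE), 0);
}

The training loop can then look like the CRNN train() above, where y holds a single time step (y = malloc(1 * sizeof(float*))) rather than one per unrolled step.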

Getting Started example not working on macOS 11.3, Apple M1 chip

Hello, people.

I just wanted to let you know that at least the Getting Started example doesn't seem to be working on macOS 11.3 with the Apple M1 chip, whereas it works correctly on my old machine, an Intel MacBook Pro running macOS 10.15.7.

If I print the size of int, float, and double types, I get the same results on both machines (4, 4, and 8).

This is the output when I try running the Getting Started example on Big Sur:

dariosanfilippo@Darios-MBP kann % seq 30000 | awk -v m=10000 '{a=int(m*rand());b=int(m*rand());print a,b,a+b}' \
  | ./examples/rnn-bit -m7 -o add.kan -
epoch: 1; cost: 0.0587254 (class error: 2.59%)
epoch: 2; cost: 0.000135723 (class error: 0.00%)
epoch: 3; cost: 7.7552e-05 (class error: 0.00%)
epoch: 4; cost: 4.28615e-05 (class error: 0.00%)
epoch: 5; cost: 4.24452e-05 (class error: 0.00%)
epoch: 6; cost: 2.26656e-05 (class error: 0.00%)
epoch: 7; cost: 1.84629e-05 (class error: 0.00%)
dariosanfilippo@Darios-MBP kann % echo 400958 737471 | ./examples/rnn-bit -Ai add.kan -
1924146487037

Would you know what the issue might be?

Thank you so much for your help.

Dario
