
spikegpt's Introduction

Hello. I'm Ridger (Rui-Jie) Zhu.

  • Interested in spiking neural networks and natural language processing
  • Currently a first-year Ph.D. student at UC Santa Cruz, supervised by Prof. Jason Eshraghian
  • How to reach me: [email protected]

spikegpt's People

Contributors

eddiem3, eltociear, jeshraghian, ridgerchu

spikegpt's Issues

SynOps Calculation

Hi @ridgerchu, first of all, congratulations on your work; it is amazing. I would like to know exactly how you calculate the SynOps numbers reported in your paper, as I do not get the same results. I look forward to hearing from you.
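
For comparison, here is a minimal sketch of one convention that is common in the SNN literature: a layer's synaptic operations are estimated as its average firing rate times the number of timesteps times the operation count of the equivalent dense layer. This is an assumption for illustration only, not necessarily the method used in the paper, and the function name and parameters are hypothetical.

```python
def estimate_synops(firing_rate, timesteps, dense_ops):
    """Estimate synaptic operations (SynOps) for one layer.

    Assumed convention (not confirmed by the paper):
        SynOps = firing_rate * timesteps * dense_ops

    firing_rate: average fraction of neurons that spike per timestep, in [0, 1]
    timesteps:   number of simulation timesteps
    dense_ops:   operation count of the equivalent non-spiking layer
    """
    if not 0.0 <= firing_rate <= 1.0:
        raise ValueError("firing_rate must be between 0 and 1")
    if timesteps < 0 or dense_ops < 0:
        raise ValueError("timesteps and dense_ops must be non-negative")
    return firing_rate * timesteps * dense_ops


# Example: 25% of neurons fire per step, 4 timesteps, 1000 dense operations.
layer_synops = estimate_synops(0.25, 4, 1000)
```

Differences in how the firing rate is averaged (per layer, per token, or over the whole network) are one plausible source of mismatched totals under this convention.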

TypeError: object of type 'NoneType' has no len()

I get this error at the end of training, when the training progress bar reaches 100%:

TypeError: object of type 'NoneType' has no len()

Full error message

Traceback (most recent call last):
  File "path/pycharm-community-2021.3.3/plugins/python-ce/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "path/pycharm-community-2021.3.3/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "path/PycharmProjects/LLM/train.py", line 137, in <module>
    trainer.train()
  File "path/PycharmProjects/LLM/src/trainer.py", line 183, in train
    run_epoch('valid')
  File "path/PycharmProjects/LLM/src/trainer.py", line 116, in run_epoch
    num_steps = len(loader)
  File "path/python3_venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 489, in __len__
    return len(self._index_sampler)
  File "path/python3_venv/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 265, in __len__
    return (len(self.sampler) + self.batch_size - 1) // self.batch_size  # type: ignore[arg-type]
  File "path/python3_venv/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 79, in __len__
    return len(self.data_source)
TypeError: object of type 'NoneType' has no len()
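
The traceback shows that len() fails on the data source behind the 'valid' loader, which suggests the validation dataset was None when the loader was built. A minimal sketch of a guard, assuming that is the cause; the helper names here are hypothetical and are not part of the repository's trainer.py:

```python
def has_samples(dataset):
    """Return True only if the dataset exists and contains at least one item."""
    if dataset is None:
        return False
    return len(dataset) > 0


def epoch_splits(train_dataset, valid_dataset):
    """List the splits that are safe to run for one epoch.

    A split is skipped when its dataset is None or empty, so calling
    len() on its loader can never hit a NoneType data source.
    """
    splits = []
    if has_samples(train_dataset):
        splits.append("train")
    if has_samples(valid_dataset):
        splits.append("valid")
    return splits


# Example: no validation set was supplied, so only training runs.
splits = epoch_splits([1, 2, 3], None)
```

With a check like this, a trainer could call its validation step only when "valid" appears in the returned list, or a real validation dataset could be passed in instead of None.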

Is there a batch inference file?

I wonder whether there is any batch inference code.
It seems that I can only input one context at a time.
It would be nice if you could provide a file like batch_run.py or something similar. Thanks!

Training setup

Hi, I'm attempting to replicate the training runs on all the different datasets. Could you provide some insight into the configuration that you used to train on all three of the datasets you mentioned in the paper?

Thanks in advance!

Using ViT with the SpikeGPT model

Hello everyone,
It is fantastic to see the great work you have done.
Is it possible to use a ViT model (or any other vision model) for image understanding together with SpikeGPT in a sequence-to-sequence image captioning task?

Linking paper and code

Hi, I had a couple of questions on the paper as well as the link to the code here.

  1. Do you have any materials on how you derived Eq. 10 in the paper from Eq. 4?
  2. I'm also a little unclear on how the CUDA function "kernel_forward" in wkv_cuda.cu implements Eq. 10. Could you provide some pointers on that, please?

Thanks!

Access to downstream task finetuned models

Hi, have you open-sourced the models that you used for the perplexity values quoted in (https://arxiv.org/abs/2302.13939)? For instance do you have the wikitext-2 and wikitext-103 models open source anywhere?

Alternatively, in order to create a custom model to reproduce those results, should I start with the provided 216M model trained on OpenWebText and finetune it on wikitext using the provided train.py script?

Thanks!

Adversarial attack

Hello, I want to study SNN image classification under adversarial attacks, and I would like to use SpikeGPT as the attacked network architecture. My data has shape (T, N, C, H, W), where T is the number of frames, N is the batch size, C is the number of channels, H is the height, and W is the width. If this is possible, what would I need to change? Thank you very much.
