
adalogn's Issues

How long does training take?

Hello, this is very valuable work. I am training from scratch, but I find it very time-consuming.
The batch size is 8, yet a single step takes about 13 minutes!

0%| | 1/2315 [13:20<514:50:48, 800.97s/it]

BERT and DGL are both on the GPU (3080 Ti, 12 GB). Is this normal for this model?
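To put the progress bar in perspective, here is a minimal sketch of the arithmetic behind its estimate, assuming the step time stays at the reported 800.97 s/it. The helper name `remaining_time` is hypothetical and not part of the repository. The result differs from the bar's 514:50:48 by a few seconds because the displayed rate is rounded.

```python
def remaining_time(total_steps: int, done_steps: int, seconds_per_step: float) -> str:
    """Format the time left as H:MM:SS, similar to the progress bar display."""
    remaining = round((total_steps - done_steps) * seconds_per_step)
    hours, rest = divmod(remaining, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Values taken from the progress bar above: 1 of 2315 steps done at 800.97 s/it.
eta = remaining_time(total_steps=2315, done_steps=1, seconds_per_step=800.97)
days = (2315 - 1) * 800.97 / 86400  # 86400 seconds per day

print(eta)
print(f"about {days:.1f} days")
```

At that rate the remaining steps would take roughly three weeks, which is why the step time itself is the thing to investigate.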

Graphene usage

Hi authors,

Could you please tell me which mode you used (server/cli) and how I can create FLAT files like yours?

Thanks for your support.

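Questions about validating and training the model

Hello authors, after reproducing the code of this project, I have two questions.
First, I used the checkpoint you uploaded to Google Drive to evaluate on both datasets. The results on the LogiQA validation and test sets and on the ReClor validation set basically match the numbers in your paper. For the ReClor test set, I added labels to test.json and got a test result of 74.4%, far higher than the number in the paper.
Second, I tried training the model from scratch on 3 RTX8000 GPUs, with the same parameters as in the original code. However, the trained model's accuracy on the validation and test sets is far below expectations, and the loss does not seem to converge. Do the parameters need to be changed?
Happy New Year, and best wishes for your work!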

Size mismatch between the checkpoint and the current model at runtime

Hello,
When I try to run eva, I get the following error:
RuntimeError: Error(s) in loading state_dict for Tagger:
size mismatch for drop_replacement: copying a param with shape torch.Size([2248]) from checkpoint, the shape in current model is torch.Size([325]).
I hope to get an answer, thanks.
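One way to narrow down an error like this is to list every parameter whose shape differs between the checkpoint and the freshly built model. The sketch below works on plain name-to-shape dictionaries; `find_shape_mismatches` is a hypothetical helper, and `encoder.weight` is an invented entry for illustration. Only `drop_replacement` and its two sizes come from the error message above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for entries that disagree."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# With PyTorch, such dictionaries could be built from a state dict, e.g.
#   {k: tuple(v.shape) for k, v in model.state_dict().items()}
# (assuming the checkpoint file stores a plain state dict).
checkpoint_shapes = {"drop_replacement": (2248,), "encoder.weight": (768, 768)}
model_shapes = {"drop_replacement": (325,), "encoder.weight": (768, 768)}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

A mismatch limited to a single parameter whose length depends on the data suggests that the model was constructed with different inputs than the ones used when the checkpoint was saved, though this is a guess rather than a confirmed diagnosis.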

GNNs.py

Hello, thanks for your work!
Is there something wrong in the forward function of class RGAT in GNNs.py?

In Section 2.3.2 of the paper, the (l+1)-th iteration depends directly on the l-th iteration. But in the code, I do not see any such connection: the output of each iteration appears to be independent. Specifically, some local variables are defined outside the loop. Should they be placed inside the for loop?
(Attached screenshot: WechatIMG677)
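The concern can be illustrated with a toy example that is not the repository's RGAT code. It contrasts a loop that feeds each iteration's output into the next with a loop that restarts every iteration from the initial input. All names (`run_chained`, `run_unchained`, `add_one`, `double`) are hypothetical.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def run_chained(h0: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Iteration l+1 consumes the output of iteration l."""
    h = list(h0)
    for layer in layers:
        h = layer(h)
    return h


def run_unchained(h0: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Every iteration restarts from h0, so only the last layer matters."""
    out = list(h0)
    for layer in layers:
        out = layer(list(h0))
    return out


def add_one(h: List[float]) -> List[float]:
    return [x + 1.0 for x in h]


def double(h: List[float]) -> List[float]:
    return [2.0 * x for x in h]


chained = run_chained([1.0], [add_one, double])      # (1 + 1) * 2
unchained = run_unchained([1.0], [add_one, double])  # 1 * 2, add_one is discarded
print(chained, unchained)
```

If the two variants give different results on the same input, the iterations are genuinely coupled, which is the behavior the issue expects from the description in the paper.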

Graphene extraction tool

Hello, could you tell me which mode of the Graphene extraction tool you used (graphene-core / graphene-server / graphene-cli) and which commands you ran?

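Hello, a question about batch size

I do not understand this very well; I would like to run your code for a course.
When running on the server, a single GPU (11 GB of memory) seems to run out of memory. I tried changing the batch size and set both batch sizes in the shell script to 1, but it still will not run.
When running on two or more GPUs, I get the error RuntimeError: NCCL Error 2: unhandled system error, and I do not know how to solve it. Many thanks!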
