data-science-competition's Introduction

Hi there 👋

About Me.

  • 🌴 I'm now an AI algorithm engineer at Alibaba Group.
  • 🌱 I graduated from Xi'an Jiaotong University with a master's degree.
  • ⚡ I am a data science competition enthusiast.
  • 🐝 I'm currently very interested in large language models. I have experience pre-training LLMs at a scale of tens of billions.
  • 📫 WeChat: qq2257164884, QQ 🐧: 2257164884.
  • 🍀 Six top-10 finishes on Ali-Tianchi and seven silver medals on Kaggle.

data-science-competition's People

Contributors

dllxw

data-science-competition's Issues

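Question about the Tianchi semantic segmentation code

Hello, I am trying to use your baseline and want to remove mixed-precision training, as shown in the screenshot, but the loss then becomes NaN. I can't work out the cause; could you please help?
[Screenshot: 微信图片_20210219114030]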

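GAIIC 2022 Track 1 open-source code: question about 5-fold CV

I looked at the code you open-sourced for GAIIC 2022 Track 1. Your train.py trains with 5-fold cross-validation. Each fold re-initializes the model weights, the optimizer, and so on, and then saves the model with the lowest loss for that fold. With 5 folds, that gives 5 best models. However, infer.py uses only the first fold's model for inference. What, then, is the purpose of the 5 folds? If you use 5 folds and the competition rules allow only a single model, how should these 5 best models be used for inference? I would appreciate your help.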
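For readers following the question, here is a minimal sketch of how 5-fold splits are usually built. It is not the repository's train.py; the function name `kfold_indices` and the sample count are made up for illustration.

```python
import random

def kfold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th element starting at i, so fold sizes differ by at most 1.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

# Hypothetical dataset of 10 samples split into 5 folds.
splits = list(kfold_indices(10, n_folds=5))
```

Under a single-model rule, one common reading is that the folds serve to estimate validation performance, after which a single fold's model is submitted. Whether that was the intent here is for the author to confirm.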

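mAP is 0 during training

During training the mAP stays at 0. What could be the cause? I did not change the source code and followed the steps (slicing the images, splitting the data), but mAP is 0 during training and map=[] at test time.
@DLLXW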

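2021 Tianchi tile surface defect detection: faster image slicing

The original slicing is single-threaded and takes about half an hour to cut everything. I changed it to be multi-threaded, and the whole run now takes about five minutes (3700X, 8C16T, 16 threads). May I open a pull request?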
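The change described above could look roughly like the following sketch. It is not the submitted patch: `slice_one` is a hypothetical stand-in for the per-image slicing work, and the image names are made up.

```python
from concurrent.futures import ThreadPoolExecutor

def slice_one(image_name):
    # Hypothetical stand-in for the per-image work (read image, crop tiles, write files).
    return f"{image_name}: done"

def slice_all(image_names, max_workers=16):
    # Executor.map() preserves input order, so results line up with image_names.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(slice_one, image_names))

results = slice_all(["a.jpg", "b.jpg", "c.jpg"], max_workers=4)
```

Threads help most when the work is I/O-bound or spends its time in native libraries that release the GIL; for CPU-bound pure-Python work, `ProcessPoolExecutor` is the usual alternative.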

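Online and offline scores differ

For Track 1 of the Global AI Technology Innovation Competition, the gap between online and offline scores may not come from the scoring method. The organizers may have deliberately selected, in some proportion, a set of samples that are harder to distinguish. Do you have any ideas for handling this situation?

[Image]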

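Weather image classification

Because the dataset contains a large amount of distracting background, and the regions that really affect the final classification occupy only a small part of each image, the team added the CBAM attention module on top of the original swsl_resnext101_32x8d so that the model learns the key regions more effectively.

Where is the code that adds CBAM? I could not find it.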

wechat 2021

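Following the run section of the README, I ran the first three commands, but the (userid, feedid) pairs in the submit file generated by lgb do not match those in test_a.csv.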
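One way to narrow down such a mismatch is to compare the key pairs of the two files directly. The sketch below uses tiny made-up tables in place of test_a.csv and the generated submit file; the `score` column and the helper `key_pairs` are assumptions for illustration only.

```python
import csv
import io

def key_pairs(csv_text):
    """Return the (userid, feedid) pairs of a CSV as a list, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["userid"], row["feedid"]) for row in reader]

# Made-up contents standing in for test_a.csv and the generated submit file.
test_a = "userid,feedid\n1,10\n2,20\n3,30\n"
submit = "userid,feedid,score\n1,10,0.5\n3,30,0.1\n4,40,0.9\n"

test_keys, submit_keys = key_pairs(test_a), key_pairs(submit)
missing = sorted(set(test_keys) - set(submit_keys))  # in test_a, absent from submit
extra = sorted(set(submit_keys) - set(test_keys))    # in submit, absent from test_a
same_order = test_keys == submit_keys                # True only if rows also align
```

If `missing` and `extra` are both empty but `same_order` is false, the rows are merely reordered rather than wrong.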

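About subtracting 1 from the data labels

Hello, you wrote: "subtract 1 from the original mask data as a whole before saving, or subtract 1 in the dataloader." I could not find any code that subtracts 1 from the labels; neither RSCDdataset.py nor make_dataset.py contains a mask-1 operation. Could you point me to where it is? Thank you!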

Error when slicing images

0%| | 0/5388 [00:00<?, ?it/s]
Traceback (most recent call last):
File "D:\Tianchi\yolov5_material\yolov5-master\make_slice_voc.py", line 223, in <module>
slice_im(image_path, ann_path, out_name, outdir, sliceHeight=1024, sliceWidth=1024)
File "D:\Tianchi\yolov5_material\yolov5-master\make_slice_voc.py", line 194, in slice_im
make_slice_voc(outpath,exiset_obj_list,sliceHeight,sliceWidth)
File "D:\Tianchi\yolov5_material\yolov5-master\make_slice_voc.py", line 85, in make_slice_voc
with codecs.open(os.path.join(slice_voc_dir, name[:-4] + '.xml'), 'w', 'utf-8') as xml:
File "D:\Anaconda3\envs\yolov5\lib\codecs.py", line 904, in open
file = builtins.open(filename, mode, buffering)
OSError: [Errno 22] Invalid argument: './slice/annotations\JPEGImages\0|819_3276_1024_1024_0_8192_6000.xml'
Press any key to continue . . .
[Screenshot: 屏幕截图 2021-01-20 170445]
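The path in the OSError above contains a `|` character, which Windows does not allow in file names, so that may be what triggers Errno 22 here. A minimal sketch of one workaround, assuming the output name can be changed freely; `safe_filename` is a hypothetical helper, not part of make_slice_voc.py.

```python
import re

# Characters Windows rejects in file names; "|" is the one in the failing path.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*]')

def safe_filename(name, replacement="_"):
    """Replace characters that are invalid in Windows file names."""
    return _INVALID_CHARS.sub(replacement, name)

fixed = safe_filename("0|819_3276_1024_1024_0_8192_6000.xml")
```

Apply this to the file name only, not to a full path, since it also replaces path separators and the drive-letter colon.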

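About prediction

When predicting on the sliced images, if a ground-truth box falls right on the edge of a tile, how should it be handled so that the prediction is more accurate?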
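A common approach, offered here as a sketch rather than the repository's method, is to slice with overlapping tiles, map each tile's boxes back to full-image coordinates, and remove duplicates with non-maximum suppression. The tile offsets, boxes, and scores below are made up.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def to_global(box, tile_x, tile_y):
    """Shift a tile-local box by the tile's top-left corner."""
    x1, y1, x2, y2 = box
    return (x1 + tile_x, y1 + tile_y, x2 + tile_x, y2 + tile_y)

def nms(detections, iou_thresh=0.5):
    """Greedy NMS over (box, score) pairs; the higher-scoring duplicate is kept."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_thresh for k in kept):
            kept.append((box, score))
    return kept

# The same object seen in tile (0, 0) and in the overlapping tile (512, 0),
# plus one unrelated detection from the second tile.
dets = [
    (to_global((600, 100, 630, 130), 0, 0), 0.9),
    (to_global((88, 100, 118, 130), 512, 0), 0.8),
    (to_global((300, 300, 340, 340), 512, 0), 0.7),
]
merged = nms(dets)
```

With enough overlap, a box cut by one tile's edge appears whole in the neighboring tile, and the merge step keeps a single copy.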

make_slice_voc error

[Image]
Traceback (most recent call last):
File "d:/Users/ZHT/PycharmProjects/yolov5_瓷砖瑕疵检测/convert_utils/make_slice_voc.py", line 211, in <module>
slice_im(image_path, ann_path, out_name, outdir, sliceHeight=640, sliceWidth=640)
File "d:/Users/ZHT/PycharmProjects/yolov5_瓷砖瑕疵检测/convert_utils/make_slice_voc.py", line 195, in slice_im
make_slice_voc(outpath,exiset_obj_list,sliceHeight,sliceWidth)
File "d:/Users/ZHT/PycharmProjects/yolov5_瓷砖瑕疵检测/convert_utils/make_slice_voc.py", line 86, in make_slice_voc
with codecs.open(os.path.join(slice_voc_dir, name[:-4] + '.xml'), 'w', 'utf-8') as xml:
File "D:\Users\ZHT\anaconda3\envs\pytorch\lib\codecs.py", line 905, in open
file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: './slice/annotations_demo\JPEGImages_demo\qyl|1024_5120_640_640_0_8192_6000.xml'

CUDA error: out of memory

Hello, my GPU is a GTX2080Ti and I have set batch_size to 2. Why does running train.py produce the following error?
Traceback (most recent call last):
File "F:/YL_file/compite/yoloV5/yolov5/train.py", line 395, in <module>
train(hyp)
File "F:/YL_file/compite/yoloV5/yolov5/train.py", line 81, in train
model = Model(opt.cfg, nc=data_dict['nc']).to(device)
File "G:\vision\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 432, in to
return self._apply(convert)
File "G:\vision\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 208, in _apply
module._apply(fn)
File "G:\vision\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 208, in _apply
module._apply(fn)
File "G:\vision\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 208, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "G:\vision\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 230, in _apply
param_applied = fn(param)
File "G:\vision\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 430, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory

About the environment

Thanks for open-sourcing this. Could you describe the base environment, for example the torch version?

Slicing stride

[Image]
Hello, I am a bit confused: if the tile size is 640, shouldn't the stride also be 640 rather than 512? It may be a very basic question.
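A stride smaller than the tile size makes adjacent tiles overlap, here by 640 - 512 = 128 pixels, so an object cut by one tile boundary can still appear whole in the neighboring tile. Whether that is the author's reason is an assumption; the sketch below only shows the arithmetic, and `window_starts` is a hypothetical helper.

```python
def window_starts(length, tile=640, stride=512):
    """Start offsets of tiles along one axis; the last tile is clamped to the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)  # extra tile so the far edge is covered
    return starts

overlap = 640 - 512                                  # pixels shared by adjacent tiles
xs = window_starts(1500, tile=640, stride=512)       # overlapping tiles
no_overlap = window_starts(1280, tile=640, stride=640)  # stride equal to tile size
```

With stride equal to the tile size the tiles only touch, so any object straddling a boundary is split in every tile that contains it.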

About ToTensor

Hello, and thank you for sharing!
In data-science-competition/kaggle/Cassava Leaf Disease Classification/, the data augmentation uses ToTensorV2. Why ToTensorV2 rather than ToTensor?

convert voc to coco

Guangdong 2021
Every time I run voc_to_coco.py, it processes only 538 items and then fails:
Traceback (most recent call last):
File "voc_to_coco.py", line 148, in <module>
os.path.join(root_path, 'coco/val2017', img_name))
File "/home/zhj/anaconda3/lib/python3.6/shutil.py", line 241, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/home/zhj/anaconda3/lib/python3.6/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: './voc/JPEGImages/232_31_t20201127084905952_CAM2.jpg'
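The traceback above says an image named in the annotations is absent from ./voc/JPEGImages. A minimal sketch for listing all such files before conversion, assuming the image names are already collected in a list; `missing_images` is a hypothetical helper, not part of voc_to_coco.py.

```python
import os
import tempfile

def missing_images(image_dir, image_names):
    """Return the names that have no file under image_dir."""
    return [n for n in image_names if not os.path.isfile(os.path.join(image_dir, n))]

# Self-contained demo: a temporary folder holding one of the two expected images.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.jpg"), "wb").close()
    absent = missing_images(tmp, ["a.jpg", "b.jpg"])
```

Running such a check first shows whether the failure is a single missing file or a systematic mismatch between annotations and images.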

wechat2021

Your model scores only 0.60 online. May I ask why?

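SWA

When using SWA, I want to save a set of weights every 5 epochs. Where should opt.swap_swa_sgd() and opt.bn_update(train_loader, model) be placed? The PyTorch documentation says to call them after training finishes, but then only the weights saved at the last epoch carry the SWA average and the others do not.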
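Conceptually, SWA keeps a running mean of the weights, and that mean can be copied out at any epoch. The sketch below shows this in plain Python with a made-up one-parameter "model"; it does not use the optimizer API from the question, and `update_average` is a hypothetical helper.

```python
def update_average(avg, weights, n_averaged):
    """Fold one more weight snapshot into the running mean."""
    return {k: (avg[k] * n_averaged + weights[k]) / (n_averaged + 1) for k in avg}

# Made-up per-epoch weights: w = 1.0 after epoch 1, 2.0 after epoch 2, and so on.
epoch_weights = [{"w": float(e)} for e in range(1, 11)]

avg, n = dict(epoch_weights[0]), 1
snapshots = {}
for epoch, weights in enumerate(epoch_weights[1:], start=2):
    avg = update_average(avg, weights, n)
    n += 1
    if epoch % 5 == 0:
        snapshots[epoch] = dict(avg)  # copy of the averaged weights at this epoch
```

In an optimizer wrapper that exposes a swap call, the analogous step would be to swap in the averaged weights, refresh the batch-norm statistics, save, and swap back before training continues. Whether that matches the library used here should be checked against its documentation.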

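About data processing for the Shopee - Price Match Guarantee competition

Hello, a beginner's question: I have considered sampling the data by label_group to generate samples of the form (image1, image2, label) and then training a binary classifier, but the solutions I have seen from others do not do this. Would there be any problem with this approach? Thank you very much!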
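The pair construction described in the question can be sketched as follows. The image ids and label_group values are made up, and `make_pairs` is a hypothetical helper; this illustrates the idea only and is not an answer from the author.

```python
from itertools import combinations

def make_pairs(label_groups):
    """Build (image1, image2, label) triples; label is 1 when the groups match."""
    items = sorted(label_groups.items())
    return [(a, b, int(ga == gb)) for (a, ga), (b, gb) in combinations(items, 2)]

# Made-up image ids mapped to made-up label_group values.
groups = {"img_a": 1, "img_b": 1, "img_c": 2, "img_d": 3}
pairs = make_pairs(groups)
n_pos = sum(label for _, _, label in pairs)
```

Even in this toy case only one of six pairs is positive. The number of pairs grows quadratically with the number of images, so negatives dominate and exhaustive pairwise scoring at inference time becomes expensive.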

Asking for help with the access to the dataset

Hello!
I found your repository, which suggests you took part in the contest and worked with the data.
Unfortunately, I lost my copy of the training data (pet_biometric_challenge_2022.zip), and my access has now expired. Could I ask you to share your copy? I would really appreciate any help. I know I am late for the contest, but I am really interested in working on the data.
Please excuse me for asking, but I feel really desperate about this and have found no way to contact the organizers directly.
Arianna
