tencent / tencent-ml-images
Largest multi-label image database; ResNet-101 model; 80.73% top-1 acc on ImageNet
License: Other
Hi,
I have a question about image normalization. I noticed the preprocessing here is not the standard ResNet preprocessing.
If we have an RGB image stored as uint8 with values in 0–255, are we supposed to normalize it as below?
image = ((image / 255) - 0.5) * 2
Thanks
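As a sanity check, the mapping asked about sends 0 to -1, 127.5 to 0, and 255 to 1. A minimal sketch with NumPy (the variable names are illustrative, not from the repo):

```python
import numpy as np

# Hypothetical uint8 pixel values covering the full 0-255 range.
image = np.array([0, 127.5, 255], dtype=np.float32)

# The normalization in question: scale to [0, 1], then shift to [-1, 1].
normalized = ((image / 255.0) - 0.5) * 2.0
# -> [-1.0, 0.0, 1.0]
```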
Sir,
Can I train with multiple GPUs if I just modify this line in train.py?
config.gpu_options.visible_device_list = str(FLAGS.train_gpus)
Does this affect the loss computation or the final model's quality? Is training still correct with this modification?
Thank you very much!
Will you provide a multi-GPU version of the training code? I think I would have to modify your EstimatorSpec or resnet_model_fn code, which is troublesome for me, especially since I am not very familiar with TensorFlow.
Does the ImageNet data here correspond to all of ImageNet, or only ILSVRC 2012? Could you provide the exact version of the data?
Thank you for the great work!
The paper says
We firstly map the categories from both ImageNet and Open Images to the WordIDs in WordNet. According to the WordIDs, we construct the semantic hierarchy among these 11,166 categories
How did you make the mapping? And, is the mapping list available in this repository?
Could this repo use Travis CI for automated testing, the way https://github.com/Tencent/Metis does (see https://travis-ci.com/Tencent/Metis)?
I cannot understand the loss function; can you explain it?
Also, if I want to reproduce it, how should I prepare the training data?
Thanks a lot.
Many of the URLs time out, and the download speed is slow. Is there a faster way to get the original images?
Thank you!
In the download script, saving images with the path save_dir + im_name will overwrite any images that share the same name.
For example:
http://i.ytimg.com/vi/6rMwgpPSJyU/3.jpg 8486:1 8479:1 8473:1 5175:1 5170:1 1042:1 865:1 2:1
http://web.mit.edu/admissions/blogs/photos/jenny-whitesox/3.jpg 10591:1 1914:1 1897:1 1829:1 1054:1 1041:1 865:1 2:1
http://bp2.blogger.com/_u3lFqBksmrE/Rgoqe1STw-I/AAAAAAAACKI/sl1nY4Q4RAc/s400/3.jpg 9199:1 9170:1 8585:1 5177:1 5170:1 1042:1 865:1 2:1
....
All of these URLs have the same image name, 3.jpg.
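One way to avoid the collision (a sketch, not the repo's actual script; `unique_name` is a made-up helper) is to prefix each saved file with a short hash of its full URL:

```python
import hashlib
from urllib.parse import urlparse

def unique_name(url):
    # Prefix the basename with a hash of the full URL so different
    # URLs that end in the same name (e.g. ".../3.jpg") don't collide.
    base = urlparse(url).path.rsplit('/', 1)[-1]
    digest = hashlib.md5(url.encode('utf-8')).hexdigest()[:10]
    return digest + '_' + base

urls = [
    "http://i.ytimg.com/vi/6rMwgpPSJyU/3.jpg",
    "http://web.mit.edu/admissions/blogs/photos/jenny-whitesox/3.jpg",
]
names = [unique_name(u) for u in urls]  # two distinct filenames
```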
The quality of this Tencent open-source code is worrying.
Another problem is that the README is poorly written and I hit many pitfalls: train.py and finetune.py expect different tfrecord files (one is multi-label, the other multi-class), and the README does not mention this at all.
In addition, not every build of TF 1.6 works; it depends on the cuDNN version, which is also not mentioned.
As the title says: has anyone else hit this problem? I get an error when running train.py on my own data.
I trained my own model using only the train folder; I did not create a val folder or its files. Yet TensorBoard shows a train_accuracy curve that starts at 1 and falls to about 0.6 by around 300,000 steps. What is going on here?
What data files go in ./ckpts?
sorry to bother you, I have two questions:
When calculating the loss, the first step is "a. get loss coefficient", and the corresponding code is as follows:
Does it refer to r in the loss function?
But the explanation of r does not match this code,
so can you tell me what this code does? Especially pos_loss_coef (0.01), neg_loss_coef (8), and loss_coef...
In train.py, record_parser_fn builds the image like this:
image = image_preprocess.preprocess_image(image=image, output_height=FLAGS.image_size, output_width=FLAGS.image_size, object_cover=0.7, area_cover=0.7, is_training=is_training, bbox=bbox)
But in finetune.py, record_parser_fn builds the image like this:
image = image_preprocess.preprocess_image(image=image, output_height=FLAGS.image_size, output_width=FLAGS.image_size, object_cover=0.0, area_cover=0.05, is_training=is_training, bbox=bbox)
Can you tell me why object_cover and area_cover differ?
Thanks!
First of all, thank you for your contribution. I have two problems. First, when I run ./tfrecord.sh,
I find that the files in /data/images/ have a numeric prefix, such as 22584_3941211541_85119b5dca_o.jpg. Do we actually only need 3941211541_85119b5dca_o.jpg? Secondly, can I train with a small amount of new data?
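Assuming the prefix really is just an index followed by an underscore (an assumption based on the filename above), it can be stripped with a single split:

```python
def strip_index_prefix(name):
    # Remove the "<index>_" prefix, splitting only on the first
    # underscore so the original name's own underscores survive.
    return name.split('_', 1)[1]

strip_index_prefix("22584_3941211541_85119b5dca_o.jpg")
# -> "3941211541_85119b5dca_o.jpg"
```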
The last 6,902,811 rows of train_urls.txt and the last 38,739 rows of val_urls.txt are URLs from ImageNet.
Does it mean:
the last 6,902,811 rows of train_urls.txt and the last 38,739 rows of val_urls.txt are URLs from Open Images?
When I load the pretrained model trained on ML-Images (ckpt-resnet101-mlimages), an error occurs as follows:
KeyError: u'TfplusAllreduce
My loading code is:
saver = tf.train.import_meta_graph("./pretrain_model/ckpt-resnet101-mlimages/model.ckpt.meta")
saver.restore(sess, "./pretrain_model/ckpt-resnet101-mlimages")
My TensorFlow version is 1.10.0.
Thank you for your advice!
Question 1: How many image URLs are in train_urls.txt? After downloading it I counted 17,609,808 lines, but your documentation says there should be 17,609,752 images. Did something go wrong during my download, or is the documentation mistaken?
Question 2: What exactly is the relationship between the images in train_urls.txt and those in train_urls_and_index_from_imagenet.txt? I don't understand what the documentation means. If a link in train_urls.txt is dead, should we look up the corresponding valid link in train_urls_and_index_from_imagenet.txt? In other words, do we need to check for ourselves whether each link is still valid while downloading?
Thanks.
Are there corresponding Chinese labels for data/dictionary_and_semantic_hierarchy.txt?
Will you release checkpoints for more models, for example MobileNet?
Thank you for your contribution. When fine-tuning, I find that the ImageNet dataset is too big. Can I build my own multi-label dataset for fine-tuning?
As the title says: I want to train with Mask R-CNN, but the documentation doesn't seem to explain how to use the annotations.
Hi:
Great thanks for releasing such a large dataset. I'm wondering how you generated the confidence scores. Is it similar to Open Images, where confidence is predicted by the Google Cloud Vision API?
Thanks.
I want to get a 256-dimensional feature vector; how can I generate it?
I downloaded the ckpt-resnet101-mlimages checkpoint and used the extract_feature.py script to extract features as follows:
python extract_feature.py --resnet_size=101 --data_format='NCWH' --visiable_gpu=0 --pretrain_ckpt=checkpoints/ckpt-resnet101-mlimages/model.ckpt --result=result.txt --images=imglist.txt
But I got the following error:
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Assign requires shapes of both tensors to match. lhs shape= [2048,1000] rhs shape= [2048,11166]
[[node save/Assign_6 (defined at extract_feature.py:67) ]]
Hi,
I am wondering if it is possible to get the coordinates of the bounding boxes? I have no idea how to.
Cheers,
Andi
Hi,
I am downloading the training dataset these days. Since it is a lot of data, I divided all the URLs into 34 parts, so each part contains about 200,000 images, and then used your shell script to download each part. But a strange thing happened: the number of invalid URLs plus the number of downloaded images exceeds 200,000. I checked one part, and the invalid-URL list contains some images that were actually downloaded successfully. Have you run into this situation?
As the title says: instead of downloading the data and training from scratch, can I run a demo test using the checkpoints provided on your page,
ckpt-resnet101-mlimages
ckpt-resnet101-mlimages-imagenet
? Could you kindly provide the concrete steps? Many thanks.
Thanks very much
:)
Thank you for your explanation of the loss function. Do you have any supplementary material about the loss?
I am curious about some code in the loss function; I found that some variables are effectively unchanged by the computation, e.g.:
non_neg_mask = tf.fill(tf.shape(labels), -1.0, name='non_neg')
non_neg_mask = tf.cast(tf.not_equal(labels, non_neg_mask), tf.float32)
Because every value in labels is 0 or 1, won't non_neg_mask be assigned all 1s?
The same goes for the variables pos_count and neg_count:
because pos_count and neg_count are initialized to all zeros, and pos_curr_count and neg_curr_count are complementary at each position, doesn't pos_count end up equal to pos_curr_count,
and neg_count equal to neg_curr_count * neg_select, after the following computation?
pos_count = tf.assign_sub(
    tf.assign_add(pos_count, pos_curr_count),
    tf.multiply(pos_count, neg_curr_count))
neg_count = tf.assign_sub(
    tf.assign_add(neg_count, tf.multiply(neg_curr_count, neg_select)),
    tf.multiply(neg_count, pos_curr_count))
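The claim can be checked numerically. The NumPy sketch below mirrors the assign chains above under two assumptions: the counts start at zero, and tf.multiply reads the variable value after the inner assign_add. The mask values are made up:

```python
import numpy as np

# Hypothetical complementary 0/1 masks, as described in the question.
pos_curr_count = np.array([1.0, 0.0, 1.0, 0.0])
neg_curr_count = 1.0 - pos_curr_count
neg_select = 0.5               # made-up selection ratio
pos_count = np.zeros(4)        # counts assumed initialized to zero
neg_count = np.zeros(4)

# Mirror of: assign_sub(assign_add(pos_count, pos_curr_count),
#                       multiply(pos_count, neg_curr_count))
pos_count = pos_count + pos_curr_count
pos_count = pos_count - pos_count * neg_curr_count

# Mirror of the neg_count chain, including the neg_select scaling.
neg_count = neg_count + neg_curr_count * neg_select
neg_count = neg_count - neg_count * pos_curr_count

# Under these assumptions: pos_count == pos_curr_count
# and neg_count == neg_curr_count * neg_select.
```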
thank you!
Hi, would you mind sharing the training settings of the ML-Images pretrained model?
I checked example/train.sh and found that BATCHSIZE is 1; there may be something wrong in example/train.sh.
Thanks!
Hello, I see there is a single-label example; how should I use it with multiple labels?
Also, doesn't this support at most 10,000+ classes? Is there code or an example for that part?
When I click the #Contents > ##Results section link,
the URL redirects to https://github.com/Tencent/tencent-ml-images#result.
I gently suggest using https://github.com/Tencent/tencent-ml-images#results
so that it jumps to the ##Results section.
I ran the multithreaded shell script on Windows in a virtual Linux environment, but the terminal output shows that many image URLs are invalid: out of 1,218 images in total, I downloaded only 228.
In addition, some of the 228 downloaded images are still invalid.
How can I solve this problem and get enough images for my research?
I didn't find a requirements.txt in the repository.
I had to install the following dependencies to run the example:
numpy==1.15.4
opencv==3.4.2
tensorflow-gpu==1.12.0
It would be more convenient to list the dependencies explicitly in a requirements file.
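For reference, a requirements.txt matching the versions listed above might look like the fragment below (note that OpenCV is installed from pip under the name opencv-python; the exact pinned version string here is an assumption):

```
numpy==1.15.4
opencv-python==3.4.2.17
tensorflow-gpu==1.12.0
```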
Hi,
These days I tried to download the data following your instructions. Since I already have the ImageNet dataset, I simply built a mapping between your imagenet_index and my ImageNet data. But after building the mapping, I found that some ImageNet indices are repeated in your txt file. Is that correct?
I notice that the loss function in train.py is cross-entropy. For multi-label image classification, shouldn't it be binary cross-entropy? Please advise!
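For multi-label targets, the usual choice is indeed a per-class sigmoid (binary) cross-entropy, e.g. tf.nn.sigmoid_cross_entropy_with_logits in TensorFlow. Below is a NumPy sketch of the numerically stable form that function computes; the logits and labels are made-up values:

```python
import numpy as np

def sigmoid_bce(logits, labels):
    # Numerically stable per-class binary cross-entropy:
    #   max(x, 0) - x*z + log(1 + exp(-|x|))
    # which is the same form tf.nn.sigmoid_cross_entropy_with_logits uses.
    return np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

# Hypothetical 3-class multi-label example with two positive labels.
logits = np.array([2.0, -1.0, 0.5])
labels = np.array([1.0, 0.0, 1.0])
loss = sigmoid_bce(logits, labels).mean()
```

The stable form avoids overflow in exp() for large-magnitude logits while staying equal to the textbook -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)).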
Thanks for your contribution. I read your paper, which says the validation results are on a single-label dataset. Can the ML-Images dataset be evaluated with multi-label outputs? If such code exists, would it be convenient to open-source it?
Could you provide the data used to pretrain your model?
Thank you for the great work!
1. The paper says
Note that each image from ImageNet-11K is annotated by a single tag.
Is each image annotated with only its leaf tag?
2. The paper says
Besides, as some categories from Open Images are similar to or synonyms of above 10,032 categories, we merge these redundant categories into unique categories. If all tags of one image are removed, then this image is also abandoned. Consequently, 6,902,811 training images and 38,739 validation images are remained, covering 1,134 unique categories .
Do the 6,902,811 training images and 38,739 validation images fall under the 1,134 unique categories only, meaning that if any tag of an Open Images image is similar to or a synonym of one of the 10,032 categories above, the image is removed? Or do they fall under the 1,134 plus the 10,032 unique categories?
Thanks for your reply.