andreasveit / coco-text Goto Github PK

COCO-Text API http://vision.cornell.edu/se3/coco-text/

Python 2.69% Jupyter Notebook 97.31%

coco-text's Issues

Regarding adding a Colab Notebook

Hi @andreasveit.

I created a Colab Notebook in order for folks to visualize the COCO-text dataset more easily and readily. Would you like me to create a PR that includes this Colab Notebook?

multi-oriented detection evaluation?

Hi, may I ask about multi-oriented detection evaluation? Do you provide an evaluation method based on polygon predictions, such as (x1,y1,x2,y2,x3,y3,x4,y4)?
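For readers with the same question, here is a minimal sketch of how an IoU between two quadrilateral predictions could be computed in plain Python. It is not taken from the COCO-Text evaluation code: the helper names (`to_points`, `clip_convex`, `polygon_iou`) are hypothetical, and the sketch assumes both polygons are convex and given as flat (x1,y1,...,x4,y4) lists.

```python
def to_points(flat):
    """Turn [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


def signed_area(pts):
    """Shoelace formula; the sign encodes the vertex orientation."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def _oriented(pts):
    """Return the vertices in the orientation with non-negative signed area."""
    return pts if signed_area(pts) >= 0 else pts[::-1]


def _cross(a, b, p):
    """Cross product of (b - a) and (p - a); >= 0 means p is on the inner side."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def _line_hit(p1, p2, a, b):
    """Intersection of segment p1-p2 with the infinite line through a and b."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = a
    x4, y4 = b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    det12 = x1 * y2 - y1 * x2
    det34 = x3 * y4 - y3 * x4
    return ((det12 * (x3 - x4) - (x1 - x2) * det34) / denom,
            (det12 * (y3 - y4) - (y1 - y2) * det34) / denom)


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` against a convex `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        if not output:
            break
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        prev = current[-1]
        for cur in current:
            cur_in = _cross(a, b, cur) >= 0
            prev_in = _cross(a, b, prev) >= 0
            if cur_in:
                if not prev_in:
                    output.append(_line_hit(prev, cur, a, b))
                output.append(cur)
            elif prev_in:
                output.append(_line_hit(prev, cur, a, b))
            prev = cur
    return output


def polygon_iou(flat_a, flat_b):
    """IoU of two convex polygons given as flat coordinate lists."""
    a = _oriented(to_points(flat_a))
    b = _oriented(to_points(flat_b))
    overlap = clip_convex(a, b)
    inter = abs(signed_area(overlap)) if len(overlap) >= 3 else 0.0
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0
```

A prediction could then be matched to a ground-truth polygon when `polygon_iou` exceeds a chosen threshold. The threshold and the matching rule are left open here because the issue does not state them.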

Question about the extracted text image

Thanks for sharing. I noticed that when I define an area range in the COCO-Text API (getAnnIds), it tends to give me very small images. I tried [1000, 5000] and [2000, -1], but it still returns very small text images (e.g. 3 x 4 pixels). I presume that by defining the area range I am asking for images of a certain area. Am I using it correctly?

Is the input measured in pixels? And what is the difference between the mask and the bounding box? Thanks.
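As a way to cross-check what an area filter returns, the following sketch filters annotation dictionaries by bounding-box area directly. It assumes each annotation carries a `bbox` in the COCO convention `[x, y, width, height]` in pixels; that layout, the toy records, and the helper name `filter_by_bbox_area` are assumptions for illustration, not the API's own implementation.

```python
def filter_by_bbox_area(anns, min_area=0, max_area=None):
    """Keep annotations whose bbox area (width * height) lies in the range."""
    kept = []
    for ann in anns:
        _, _, width, height = ann["bbox"]  # assumed [x, y, width, height]
        area = width * height
        if area < min_area:
            continue
        if max_area is not None and area > max_area:
            continue
        kept.append(ann)
    return kept


# Toy annotations, not real COCO-Text records.
toy_anns = [
    {"id": 1, "bbox": [10, 10, 3, 4]},     # area 12
    {"id": 2, "bbox": [50, 20, 60, 25]},   # area 1500
    {"id": 3, "bbox": [0, 0, 100, 80]},    # area 8000
]

in_range = filter_by_bbox_area(toy_anns, min_area=1000, max_area=5000)
kept_ids = [ann["id"] for ann in in_range]
```

Comparing the ids kept by such a filter with the ids returned by getAnnIds for the same range should show whether the range is being applied as expected.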

Images' filename

>>> ct = coco_text.COCO_Text('COCO_Text.json')
>>> train = [id for id in ct.imgs.keys() if 'train2014' in ct.imgs[id]['file_name']]
>>> len(train)
63686
>>> len(ct.train)
43686

Apparently every entry in imgs has a file_name field as follows:

COCO_train2014_ID.jpg

even though validation and testing images should have a different one (val2014 and test2014 instead of train2014).

Note that this is a problem in COCO_Text.json rather than the API itself.

Python 3 Support

The code is not compatible with Python 3. If you are willing to accept PRs, I would love to contribute 😄

MSCoco 2014 Train Images do not Match the Annotations

The website for COCO-Text (https://bgshih.github.io/cocotext/) says to download the 2014 train images.

However, when I downloaded the train images from the linked website, several of the images specified in the annotation file did not exist in the folder I downloaded from MSCOCO. For example, the annotation file specifies "COCO_train2014_000000540965.jpg", but the 2014 training images I downloaded did not contain this image.

Am I downloading the wrong images, or have these images been updated since the website has been updated?
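To quantify a mismatch like this, one could list every file name referenced by the annotations that is absent from the image folder. The sketch below does so with the standard library only. The helper `missing_images` is hypothetical, and the demo uses a temporary folder with made-up file names rather than the real dataset.

```python
import os
import tempfile


def missing_images(file_names, image_dir):
    """Return the referenced file names that are not present in image_dir."""
    present = set(os.listdir(image_dir))
    return sorted(name for name in file_names if name not in present)


# Demo with a temporary folder and placeholder names.
with tempfile.TemporaryDirectory() as image_dir:
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(image_dir, name), "w").close()
    missing = missing_images(["a.jpg", "b.jpg", "c.jpg"], image_dir)
```

With the real data, `file_names` would come from the `file_name` field of each image entry in the annotation file.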

No module named 'editdistance' ....

error: No module named 'editdistance'
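The `editdistance` package is published on PyPI, so installing it with pip is the usual fix. Where installing is not an option, a pure-Python fallback along the following lines may be enough. This is a sketch that assumes only a single string-to-string distance is needed; the wrapper name `edit_distance` is hypothetical.

```python
try:
    import editdistance

    def edit_distance(a, b):
        """Use the compiled package when it is available."""
        return editdistance.eval(a, b)

except ImportError:

    def edit_distance(a, b):
        """Levenshtein distance by dynamic programming over two rows."""
        prev = list(range(len(b) + 1))
        for i, char_a in enumerate(a, 1):
            cur = [i]
            for j, char_b in enumerate(b, 1):
                cur.append(min(
                    prev[j] + 1,                       # deletion
                    cur[j - 1] + 1,                    # insertion
                    prev[j - 1] + (char_a != char_b),  # substitution
                ))
            prev = cur
        return prev[-1]
```

The fallback is slower than the compiled package but returns the same distance for plain strings.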

Regarding v1's polygon vs v2's mask

Hello, thank you for the great resource. I have one question.

In COCO-Text version 1, showAnn() in coco_text.py contains:
tl_x, tl_y, tr_x, tr_y, br_x, br_y, bl_x, bl_y = ann['polygon']
So the polygon information has a fixed length (8).

On the other hand, showAnn() in https://github.com/bgshih/coco-text/blob/master/coco_text.py contains:
verts = list(zip(*[iter(ann['mask'])] * 2)) + [(0, 0)]
which means the mask annotation can have a different length per box.

Could you explain what this difference means?
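For reference, the `zip(*[iter(...)] * 2)` idiom quoted above groups a flat coordinate list into (x, y) pairs, so it works for any even number of values, whereas the eight-name unpacking only accepts exactly four points. A small sketch with made-up coordinates (the helper name `pair_up` is hypothetical):

```python
def pair_up(flat):
    """Group a flat [x1, y1, x2, y2, ...] list into (x, y) tuples."""
    # Both zip arguments are the same iterator, so each tuple consumes
    # two consecutive values from the list.
    return list(zip(*[iter(flat)] * 2))


quad = [10, 20, 110, 20, 110, 60, 10, 60]                # 4 points
hexagon = [0, 5, 5, 0, 15, 0, 20, 5, 15, 10, 5, 10]      # 6 points

quad_pts = pair_up(quad)
hex_pts = pair_up(hexagon)
```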

Which dataset should I download for COCO-Text annotations 2017 v1.4?

https://vision.cornell.edu/se3/coco-text-2/
http://cocodataset.org/#download

Any research paper related to this?

Have you written a paper about this contribution? I need details about this code: features, accuracy, precision, architecture, etc.

Problems occurred when running ct.showAnns(anns)

Hi, I ran this API on macOS and everything worked until the line ct.showAnns(anns).
The warning is as follows:

And the result of this line is:

Is there something wrong with this code or my computer?

Do you provide the annotations for the val set?

Label quality varies too much

Hi,
Is there any plan to release better labels for the training set? The quality of the current ones varies too much. Some polygons are exactly the same as the bbox for oriented text, which is really annoying.
Thanks

Some example images with typical issues:
COCO_train2014_000000294914.jpg: two different polygons cover the same text differently
COCO_train2014_000000262184.jpg: the polygon does not cover the text
COCO_train2014_000000131174.jpg: a region seems to be mislabeled

About the 1000 test annotations

Hi, where can I find the test annotations? The json files contain only the train set and the val set. The ICDAR2017 official page says '1000 val and 1000 test'. Should I split the 2000 val images into two parts and change the json file myself?

BTW: there are some spelling errors in the code and on the official page: some occurrences of 'illegible' are spelled 'illegilbe'. See
https://github.com/andreasveit/coco-text/blob/master/coco_text_Demo.ipynb
and the official site https://vision.cornell.edu/se3/coco-text/

Missing annotations

This is not really a problem with the API but rather with COCO_Text.json.
I couldn't find where to report this on the website, so here I am.

While running coco_evaluation.getDetections(), this is what happened:

gt_box = groundtruth.anns[gt_box_id]['bbox']
KeyError: 1218650

The same holds for coco_evaluation.evaluateEndToEnd().

This is due to non-existent annotations referenced in imgToAnns.

Am I missing something, or is this a real problem in the json file?
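A quick way to confirm this kind of inconsistency is to scan imgToAnns for ids that have no entry in anns. The sketch below runs on toy dictionaries shaped like the two mappings named in the issue. The helper name `dangling_ann_ids` is hypothetical, and it assumes the ids in both mappings have the same type.

```python
def dangling_ann_ids(img_to_anns, anns):
    """Map each image id to the annotation ids it references but anns lacks."""
    dangling = {}
    for img_id, ann_ids in img_to_anns.items():
        bad = [ann_id for ann_id in ann_ids if ann_id not in anns]
        if bad:
            dangling[img_id] = bad
    return dangling


# Toy data: annotation 1218650 is referenced but never defined.
toy_anns = {
    100: {"bbox": [0, 0, 10, 10]},
    101: {"bbox": [5, 5, 20, 8]},
}
toy_img_to_anns = {
    1: [100, 101],
    2: [1218650],
}

dangling = dangling_ann_ids(toy_img_to_anns, toy_anns)
```

Skipping or removing the reported ids before evaluation would avoid the KeyError, although it does not explain why the references are missing in the first place.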

How to generate char masks for V2 of COCO-Text?

The website mentions:
Mask Annotations
Segmentation mask is annotated for every word, allowing fine-level detection.

How can those be generated?

Can't find the COCO_Text.json annotation file?

I downloaded this zip file
2014 Train/Val annotations 241MB
from http://cocodataset.org/#download but I am unable to find the required COCO_Text file in the extracted folder.

Average Precision in evaluation script?

It seems that COCO-Text ICDAR17 uses VOC-style AP as its evaluation metric, so I am curious why the evaluation API does not support it.
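For context, one common VOC-style definition is the 11-point interpolated AP, sketched below. Whether the ICDAR17 competition uses this variant or the area-under-curve variant is not stated in this issue, so treat the choice as an assumption. The function name `voc_ap_11_point` is hypothetical and is not part of the evaluation API.

```python
def voc_ap_11_point(recalls, precisions):
    """11-point interpolated average precision.

    recalls and precisions are parallel lists, one entry per operating
    point of the detector (for example, per score threshold).
    """
    total = 0.0
    for step in range(11):
        threshold = step / 10.0
        # Interpolated precision: best precision at recall >= threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


# Toy curve: precision 1.0 at recall 0.5, precision 0.5 at recall 1.0.
toy_ap = voc_ap_11_point([0.5, 1.0], [1.0, 0.5])
```

The precision and recall values would come from sweeping the detection score threshold over the matched predictions.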

language 'na'

In the annotations, language is classified as 'English', 'not English' or 'na'.

Could you explain the meaning of 'na'?

Can you please revisit the whole procedure on how to setup your code?

I believe the content behind the links you provided has changed. Kindly provide the appropriate links or, if the data is no longer available, upload it to cloud storage so that we can download it.

Thanks

Creating tfrecord for the dataset

Hi,

I'm currently trying to train the coco-text dataset with the Tensorflow object detection API. I would like to discuss here how to create a script that allows us to interface with the TF object detection API. Please note that I'm able to parse the tfrecords generated by my own script in a graph session. With the TF Object detection API I end up getting:

ConcatOp : Dimensions of inputs should match: shape[0] = [1,46] vs. shape[1] = [1,23]
	 [[Node: concat = ConcatV2[N=4, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ExpandDims_1, ExpandDims_2, ExpandDims_3, Equal_4/y)]]

I'm wondering what could be the issue and how to solve it, since the error doesn't say much.
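The two shapes in the error differ by exactly a factor of two (46 versus 23), so one thing worth checking is whether every per-box feature list holds one value per box. The sketch below builds the per-box lists in plain Python under the feature keys used by the TensorFlow Object Detection API's record-creation scripts and verifies that they have equal length. The `[x, y, width, height]` bbox layout, the single class label and the helper name `bbox_feature_lists` are assumptions, and this is a consistency check rather than a diagnosis of the error.

```python
def bbox_feature_lists(anns, image_width, image_height, label=1):
    """Build per-box lists with normalized corners, one value per box."""
    features = {
        "image/object/bbox/xmin": [],
        "image/object/bbox/xmax": [],
        "image/object/bbox/ymin": [],
        "image/object/bbox/ymax": [],
        "image/object/class/label": [],
    }
    for ann in anns:
        x, y, width, height = ann["bbox"]  # assumed [x, y, width, height]
        features["image/object/bbox/xmin"].append(x / image_width)
        features["image/object/bbox/xmax"].append((x + width) / image_width)
        features["image/object/bbox/ymin"].append(y / image_height)
        features["image/object/bbox/ymax"].append((y + height) / image_height)
        features["image/object/class/label"].append(label)
    return features


def all_same_length(features):
    """True when every feature list has the same number of entries."""
    return len({len(values) for values in features.values()}) <= 1


# Toy annotations for a 640 x 480 image.
toy_anns = [{"bbox": [64, 48, 128, 96]}, {"bbox": [0, 0, 320, 240]}]
toy_features = bbox_feature_lists(toy_anns, image_width=640, image_height=480)
consistent = all_same_length(toy_features)
```

If one list in a real record turns out to be twice as long as the others, for instance because x and y values were written into the same feature, that would be a natural place to look first.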

About the ICDAR 2017 COCO-Text Task 2 cropped words dataset

I don't understand why there is a '|' in the annotation file, such as "1221677,|EBB,| ", when I can't see a '|' in the corresponding image.

Dataset download

Hello, sorry if my question seems silly, but I can't figure out how to download the COCO-Text dataset images.